5/28-31st/2024 - Server Format UPDATE!
- Atom 5ive
- May 28, 2024
- 6 min read
Updated: May 31, 2024
See the bottom for up to the hour updates!
Story of my life. Check, double-check, and then when you go to deploy, you run into an issue. When you go to see how to fix that issue, you realize three other issues need to be resolved.
Currently stuck, I have two choices: rebuild the server "the old way" again to get it back up and running, or continue to troubleshoot my current issue. I've decided on the latter. This means the server will have extended downtime.
To be more specific, on a high level, the red part isn't talking to the orange part.
The server isn't seeing the cable or card that was installed in the red part. So I've been researching since 4pm today on how to fix this, and man, have I done a lot. I'm not giving up. I won't restore this to its old state just to break it all down again when I find a fix. Please bear with me; I will get this up and working as fast as I can.
Technical details of where I'm stuck:
Server and Hardware Details:
Server: Dell PowerEdge R730XD
JBOD: NetApp DS4246
HBA Card: Sun Oracle 7047853 8-port 6Gbps SAS-2 HBA SGX-SAS6
Cables: Two CAB-QSFP to SFF-8088 Hybrid Mini SAS Twinax Copper Passive Hybrid Mini SAS Cables
Current Setup:
The Sun Oracle HBA card is installed in the Dell PowerEdge R730XD server.
The hybrid mini SAS cables connect the HBA card to the NetApp DS4246 JBOD.
Issue:
The Dell server does not recognize the RAID card via BIOS or iDRAC.
The internal LED on the HBA card is blinking green slowly.
The link light on the JBOD port is inactive.
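For anyone following along at home, the first thing worth checking from a Linux live environment is whether the card even enumerates on the PCIe bus, independent of BIOS/iDRAC. Here's a minimal sketch that filters `lspci -nn`-style output for SAS controllers. The sample lines are illustrative, not captured from my server (I believe the SGX-SAS6 is an LSI SAS2008-based card, so that's what the second line stands in for; the 3108 line represents the R730XD's onboard controller):

```python
# Minimal sketch: scan lspci-style output for SAS controllers/HBAs.
# The sample output below is made up for illustration.

SAS_KEYWORDS = ("SAS", "Serial Attached SCSI")

def find_sas_controllers(lspci_output: str) -> list[str]:
    """Return lspci lines that look like SAS controllers or HBAs."""
    return [
        line for line in lspci_output.splitlines()
        if any(kw in line for kw in SAS_KEYWORDS)
    ]

sample = """\
01:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [1000:005d]
03:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [1000:0072]
04:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521]
"""

for ctrl in find_sas_controllers(sample):
    print(ctrl)
```

If the add-in HBA doesn't show up in a listing like this at all, the problem is upstream of firmware and cables entirely.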
Steps Taken and Observations:
Swapped Cables:
I switched the hybrid mini SAS cables to rule out any cable-related issues. Despite this, the server still did not recognize the RAID card, and the JBOD link light remained inactive.
Tried Another Identical HBA Card:
I used a second, identical Sun Oracle HBA card to check if the first card was defective. Unfortunately, the server still did not detect the RAID card, indicating the problem might not be with the HBA card itself.
Tested Different PCI Slots:
I installed the HBA card in various known working PCI slots within the server to ensure there was no issue with the PCI slot. This did not resolve the problem, as the RAID card was still not recognized.
Updated Firmware:
The server is running the latest version of Dell's firmware. I confirmed this to rule out any compatibility issues due to outdated firmware.
LED and Link Light Observations:
The internal LED on the HBA card blinks green slowly, suggesting some activity or standby mode. However, the link light on the JBOD port is not lit, indicating a lack of communication or recognition between the devices.
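One more check that pairs with the steps above: even if the BIOS is quiet, the kernel log will say whether a driver is probing the card. On Linux, LSI SAS2008-era cards are handled by the `mpt2sas`/`mpt3sas` driver family. A small sketch of the idea, with hypothetical log lines (not real output from this server):

```python
# Sketch: grep kernel messages for the SAS driver to see whether
# the HBA is being probed at all. The dmesg lines below are
# hypothetical examples for illustration.

def driver_messages(dmesg: str, driver: str = "mpt2sas") -> list[str]:
    """Return kernel log lines mentioning the given driver."""
    return [line for line in dmesg.splitlines() if driver in line]

sample_dmesg = """\
[    3.120] mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED
[    3.480] mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00)
[    9.901] scsi host2: Fusion MPT SAS Host
"""

msgs = driver_messages(sample_dmesg)
print(f"{len(msgs)} mpt2sas message(s) found")
```

No driver messages at all would point at the card/slot; driver messages but no attached devices would point at the cables or the JBOD side.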
Challenges and Need for Help:
Firmware Validation:
Challenge: I am unsure how to check and update the firmware for both the Sun Oracle HBA card and the NetApp DS4246 JBOD.
Need Help: Guidance on how to validate and, if necessary, update the firmware for both devices. This might help ensure they are compatible and functioning correctly.
JBOD Configuration:
Challenge: I am unclear on how to configure the NetApp DS4246 JBOD to properly recognize and communicate with the HBA card.
Need Help: Step-by-step instructions or resources on configuring the JBOD to establish a proper connection with the HBA card. This includes any necessary settings or procedures that need to be followed.
Diagnostic Tools and Methods:
Challenge: The system lacks a web front-end and outputs only to SolarWinds servers, which I prefer not to set up. I tried using PRTG Network Monitor but could not read any output ports.
Need Help: Recommendations for alternative diagnostic tools or methods to monitor and validate the connection between the server and the JBOD. Any tips on interpreting the LED signals or other diagnostics would also be appreciated.
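On the diagnostics question: one lightweight alternative to a full monitoring stack is just enumerating SCSI devices. If the DS4246's IOM modules are talking at all, they should show up as SCSI enclosure ("enclosu") devices in `lsscsi`-style output, separate from the disks. A sketch of that check, with made-up sample rows:

```python
# Sketch: look for SCSI enclosure devices in lsscsi-style output.
# A visible enclosure means the JBOD's I/O module is at least
# communicating. The sample rows are invented for illustration.

def find_enclosures(lsscsi_output: str) -> list[str]:
    """Return lsscsi lines whose device type is an enclosure."""
    return [line for line in lsscsi_output.splitlines() if "enclosu" in line]

sample = """\
[0:0:0:0]  disk    ATA      WDC WD101EFAX    /dev/sda
[1:0:0:0]  enclosu NETAPP   DS424IOM6        -
[1:0:1:0]  disk    ATA      WDC WD101EFAX    /dev/sdb
"""

encl = find_enclosures(sample)
print(f"{len(encl)} enclosure device(s) visible")
```

Zero enclosure devices with a working HBA would be strong evidence for a cabling or JBOD-side problem rather than a server-side one.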
Summary: Despite my efforts in swapping cables, trying another identical card, testing different PCI slots, and updating the server firmware, I am still unable to get the Dell PowerEdge R730XD to recognize the RAID card or establish a proper connection with the NetApp DS4246 JBOD.
Update: 5/28 10:00pm - Manually added the server cards to my Windows PC in an effort to upgrade the firmware on them, hoping this fixes the connection issues.
Update: 5/28 10:30pm - That did not work. I'm now working on upgrading the firmware for the JBOD. Not easy, as the website won't "just let me download the firmware." Found a workaround; trying it now.
Update: 5/29 12:30am - Found out you need to be a "big paying customer" to update the firmware on the JBOD. So that's a no-go. A user online pointed me toward a custom firmware for this HBA controller that might let it see the drives. Trying that now.
Update: 5/29 2:04am - I got the hypervisor to see the HBA controller! Finally!!! But it's still not seeing the bottom server. I believe it's the cables. I was given the wrong cables by the eBay seller that sold me the JBOD server. I've ordered the correct cables, which will arrive on Friday. This means the server will be down until I get the correct cables. Sigh. I tried to overnight them, but it wasn't an option. Until then, I will keep trying to make what I have work, with hundreds of combinations.
I can't stress enough: once the server below is linked, adding drives for expansion, as well as management, will be 110% better. Sorry for this large of an outage, but until I can get the right cables, the server will be down. If, for whatever reason, the cables do not work, I will bring the old configs up this week to get the server back up. I just really don't want to, as it will cause more issues and make reversing things even harder.
Update: 5/29 6:26pm - I've tried two different OS hypervisors (TrueNAS Scale (latest) and Proxmox VE 8.2.2), and both have the same issue. While both systems detect the LSI MegaRAID SAS-3 3108 controllers, neither can see the NetApp DS4246 JBOD or the three 10TB drives in the JBOD system. Updating the firmware on the NetApp JBOD is nearly impossible, as they require you to be a paying customer in order to get a custom firmware update tied to your device's serial and paid account. (Because reasons)
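To make the symptom concrete: the controller shows up, but the three 10TB drives never do. A quick sanity check is just counting them in `lsblk`-style output (a 10 TB drive typically reports as roughly 9.1T in lsblk's binary units). Both listings below are made-up illustrations, not real captures from my server:

```python
# Sketch: count the JBOD's 10TB drives in lsblk-style output.
# lsblk reports a 10 TB drive as roughly 9.1T (binary units).
# Both sample listings are invented for illustration.

def count_10tb_drives(lsblk_output: str) -> int:
    """Count lines that look like ~10 TB drives."""
    return sum(1 for line in lsblk_output.splitlines() if "9.1T" in line)

# What I'm seeing now: only the boot disk, no JBOD drives.
sample_before = """\
NAME SIZE
sda  465.8G
"""

# What success should look like once the JBOD link works.
sample_after = """\
NAME SIZE
sda  465.8G
sdb  9.1T
sdc  9.1T
sdd  9.1T
"""

print(count_10tb_drives(sample_before), count_10tb_drives(sample_after))
```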
More and more, it's pointing at either the JBOD config or a simple cable issue. Will know soon enough.
Update: 5/30 2:55 pm - It's alive!!!!
Well, not really; the server is still formatted, BUT! The biggest issue has been fixed. I can now use all the drive bays below in the orange/yellow square above. They're all recognized. I threw in 3x 10TB drives (ignore the tray labels) and BAM! As you can see below...
Each one, one by one, lit up green on the top right. I checked the back and BAM!
Link light baby! Meaning, the server is now "connected" directly to all the drives that will be added over time to that bay.
Now what?! Now the building starts. With the new cable fixing the issue, I can start to rebuild all the pools and rebuild the OS in a way that will be faster and more reliable. NOW I HAVE ROOM TO EXPAND!
More updates to come! I'm hoping to have this done by morning or earlier, provided I don't run into any more issues.
Update: 5/30 6:22 pm EST - OS is set up and installed, and I'm now copying all the backup data from those applications over to the new OS on the new system! This will take some time, as the server it's backed up to is slower than this one.
I estimate, "if everything goes well," that by 12:30am EST TONIGHT everything will be back up and working again. I will be taking it down once more the following day, once I know everything is solid and working, to do some cleanup on the server (behind the server: stuff like new cables added in the back, plus the external drives I was using to increase storage, which I no longer need now that everything is internal), but it won't be downtime like this. I'll let everyone know ahead of time. So, so, so happy I'm making progress!
Update: 5/31 12:10 am EST - Final extraction of the original Plex server. As soon as this is done, I will be bringing the server back up. All other services are 100% working. Just working on the Plex application specifically.
I'm going to try and stay up to finish this.
Update: 5/31 2:31 am EST - Done. I'm letting everything run overnight, but it's purring like a kitten. The server is officially up and operational with no data loss.
I will be taking it down sometime, maybe tomorrow afternoon? But it will be brief; again, it's just for cable management. If not the afternoon, then depending on how many people are on, it will be late, very late. Either way, there's lots in store now with so much wiggle room; more details to come on that later. Thank you again to all of those who checked in from time to time and stuck this out with me. I wanted to get this up as fast as possible. Sorry for the downtime.
It will be back up ASAP!