Monday 26 September 2011

Hardware oddities - X4450 memory issue Part 1

Not so long ago we bought an extra 16GB of RAM in 4GB sticks for one of our X4450s. This would let us increase the memory from 32GB to 48GB to handle some of our memory-hungry applications. Inserting the sticks is not a difficult job; in fact it is quite easy, but unfortunately I have a bad track record with it, mostly due to faulty DIMMs. On an X4450 the DIMMs are installed in pairs, so when you open up the server you are presented with 32 neat slots. 4 of these are coloured blue and the rest are coloured black. Your largest paired DIMMs (4GB in our case) go into the blue slots, and the smaller DIMMs (2GB for us) then populate the black slots.
After I had finished up and powered on the server, an orange light appeared, indicating a fault somewhere. So I opened the server again and, with it powered on, saw an orange light lit right beside slot A0. Thinking that it was a faulty DIMM, I switched it with the one from slot B0. The orange light at slot A0 appeared again, yet the light at slot B0, which now held what I thought was the faulty DIMM, remained off. I then switched with the other 4GB pair (slots C0/D0) and the A0 light still appeared. Since the fault stayed with the slot rather than following a DIMM, the most likely culprit was the memory mezzanine, which luckily can be removed and replaced easily. I opened a call with Oracle and a new one was sent out. After replacing the allegedly faulty mezzanine and repopulating the slots, I powered up the server and found, to my dismay, the same A0 light. I swapped the DIMMs around again, which unfortunately did not fix the issue. So at the moment there are a few possible causes:
  1. Faulty DIMM(s)
  2. Faulty mezzanine connector
  3. Faulty motherboard
  4. Firmware issue
  5. OS has not cleared old error
After production hours I will take down this server and a spare X4450 and swap the mezzanine and the DIMMs between them. If the A0 light appears on the spare server, I can narrow the fault down to the DIMMs or the mezzanine kit. Hopefully we will be able to use the spare server's mezzanine with the new DIMMs. I will follow up with an update.
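
The swap tests above boil down to a simple rule: if the fault light moves with the part, suspect the part; if it stays put, suspect whatever the part was plugged into. Below is a minimal Python sketch of that reasoning. The function names and return strings are made up for illustration, and the "light stays off on the spare" branch is my own inference from the list of causes above, not something taken from Oracle documentation.

```python
def interpret_slot_swap(fault_before, fault_after, slot_a, slot_b):
    """Classify the result of swapping the DIMMs in slot_a and slot_b.

    fault_before / fault_after are the slot names whose fault light was
    lit before and after the swap (hypothetical inputs, read by eye).
    """
    if fault_before not in (slot_a, slot_b):
        return "inconclusive"
    other = slot_b if fault_before == slot_a else slot_a
    if fault_after == fault_before:
        # The light ignored the DIMM move: the slot side is suspect.
        return "stays with slot"
    if fault_after == other:
        # The light moved along with the DIMM: the DIMM is suspect.
        return "follows DIMM"
    return "inconclusive"


def suspects_after_cross_swap(light_on_spare):
    """Narrow the suspects after moving mezzanine and DIMMs to the spare.

    Assumption: if the light does not appear on the spare, the parts
    that were moved are fine and the fault stayed with the original box.
    """
    if light_on_spare:
        return ["DIMM(s)", "mezzanine"]
    return ["mezzanine connector", "motherboard", "firmware", "stale OS error"]


# The A0/B0 swap described above: the light stayed at A0.
print(interpret_slot_swap("A0", "A0", "A0", "B0"))
print(suspects_after_cross_swap(True))
```

The first call reproduces the A0/B0 swap from this post, and the second shows what the planned cross-server test would tell us if the light turns up on the spare.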
