
Please help me, this is very very important.

I have a CentOS server that had been running properly for a month. It was installed with LVM.

Today I found that I could not connect to MongoDB, and restarting the service failed. In addition, when I typed su root to switch user, I waited a long time but it never asked for my password. I tried to copy the MongoDB log to my home directory, but the copy failed with: error reading 'mongod.log': Input/output error

So I decided to reboot the server to see whether that would help.

Then the server automatically entered emergency mode (screenshot omitted).

Then I ran some commands to see what had happened (screenshot omitted).

Does this mean that my SSD (WD SN750) is dead? The server was working properly four days ago, and before the reboot I could still do some I/O in other folders. Now, however, it only boots into emergency mode.

How can I fix this? There is important data for my working paper on it.

Tom Leung
  • 101

1 Answer


Boot from any rescue media (Live CD, PXE, whatever), run the following command, and post the result:

smartctl -A /dev/sda

Consult https://en.wikipedia.org/wiki/S.M.A.R.T. for the attributes that might be failing. Normally the most important one is Reallocated Sectors Count (its raw value should be relatively low).
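To pull just that attribute out of the table, a small filter along these lines may help. It is only a sketch: realloc_raw is a hypothetical helper name, and it assumes the ATA-style attribute table that smartctl -A prints, where the raw value is the tenth column. NVMe drives report a different health log, so this filter would print nothing for them.

```shell
# Hypothetical helper: print the raw Reallocated_Sector_Ct value
# from `smartctl -A` output read on stdin.
# Assumes the ATA attribute table layout:
#   ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
realloc_raw() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Usage (as root, from the rescue system; /dev/sda is the device used above):
#   smartctl -A /dev/sda | realloc_raw
```

A value that is non-zero and growing between runs is a stronger warning sign than a small, stable number.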

Then you can mount your old filesystem somewhere, e.g. /mnt/system, and inspect your logs. I presume you have at least CentOS 7, which uses journald, so depending on whether your drive is SATA or NVMe you could simply check whether there are any I/O errors by running:

journalctl -D /mnt/system/var/log/journal | grep -E 'sd|nvme'
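A slightly narrower variant is sketched below, assuming the journal is read with journalctl (the client tool for journald) and that the old system kept a persistent journal under var/log/journal. If that directory does not exist, the journal was probably not persistent, and /mnt/system/var/log/messages is the place to look instead. io_errors is a hypothetical helper name, and its pattern list is an assumption, not an exhaustive set of kernel messages.

```shell
# Hypothetical helper: keep only log lines that look like block-device trouble.
# The patterns are a guess at common kernel wording, matched case-insensitively.
io_errors() {
    grep -Ei 'i/o error|medium error|critical target error|blk_update_request'
}

# Usage from rescue media, with the old root mounted at /mnt/system:
#   journalctl -D /mnt/system/var/log/journal -k --no-pager | io_errors
#   io_errors < /mnt/system/var/log/messages
```

Here -k limits the output to kernel messages, which is where disk errors are logged.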

I would not recommend running badblocks if your disk is indeed failing, because doing so may damage even more data. First use ddrescue to retrieve the data; only then do anything else.
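A common two-pass approach with GNU ddrescue is sketched below. rescue_disk is a hypothetical wrapper, and the source device, image path and map file path are placeholders you must supply; the image and map file have to live on a different, healthy disk with enough free space.

```shell
# Hypothetical wrapper around GNU ddrescue: image a failing disk in two passes.
#   $1 = source device, $2 = image file, $3 = map file
# Set DRY_RUN=1 to print the commands instead of running them.
rescue_disk() {
    src=$1
    img=$2
    map=$3
    run=${DRY_RUN:+echo}
    # Pass 1: copy everything that reads cleanly; -n skips the slow scraping phase.
    $run ddrescue -n "$src" "$img" "$map" || return
    # Pass 2: return to the bad areas with direct disc access (-d), up to 3 retry passes (-r3).
    $run ddrescue -d -r3 "$src" "$img" "$map"
}

# Usage (as root, placeholders only):
#   DRY_RUN=1 rescue_disk /dev/sdX /mnt/backup/disk.img /mnt/backup/disk.map
```

The map file lets ddrescue resume where it stopped, so the same command can be rerun after an interruption without rereading areas already rescued.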

  • I turned on my server today and everything just works fine. I have no idea why. Could you suggest some possible causes for this situation? BTW, the SMART line for Reallocated Sectors Count is: 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 – Tom Leung Jan 29 '21 at 05:39
  • Post the complete smartctl -a /dev/device output after running smartctl -t long /dev/device – Artem S. Tashkinov Jan 29 '21 at 10:03