How to remove a disk from a Dell PowerEdge PERC H710-managed RAID0 array in Ubuntu without restarting

Finding the answer to this took me forever for 2 reasons:

  1. Dell has never supported Ubuntu super well. It’s “left up to the community”, so their official repos for command line utilities are either a bust or outdated by half a decade.
  2. I constantly found information on EXPANDING an array or REPLACING a failed disk, but not straight up removing a disk.

Now let’s dive into the details…

Use Case

  • Assume you have a RAID0 array with N disks in it (in my case, N=6).
  • Assume it contains unimportant data that you don’t mind losing.
  • Assume 1 of the disks in the array is failing or has already failed (in my case, flashing orange on the front of my PowerEdge).
  • Assume you don’t mind rebuilding the array with less space; it was over-provisioned anyway and you just need the array working to generate files.
  • Assume you don’t want to reboot the server and get into the BIOS PERC setup utility.
  • Assume you don’t know how to use the built in iDRAC management software.
  • Assume you want to do all of this from the Ubuntu 20.04 command line like a boss.

Requirements

A relatively current perccli toolset. I found a 2018 build on the Dell Support site, which was perfect for my use case:

https://www.dell.com/support/home/en-us/drivers/driversdetails?driverid=52r3d (Direct Download Link)

But that’s an RPM file…

Right-o you are. To convert it into a simple .deb file that installs on Ubuntu, you need the alien utility, so let’s install that, convert the RPM, and install the result:

sudo apt install alien
sudo alien perccli-007.0318.0000.0000-1.noarch.rpm
sudo dpkg -i perccli_007.0318.0000.0000-2_all.deb

Now you can run the perccli utility!

/opt/MegaRAID/perccli/perccli
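If that path does not exist on your machine, the binary may simply have a different name. Here is a minimal sketch of a lookup helper, assuming the package ships either "perccli64" or "perccli" in the install directory (that naming is my assumption; "dpkg -L perccli" will list what actually got installed). The function name find_perccli is made up for this post.

```shell
# Hypothetical helper: print the first executable perccli binary found in a
# directory. The candidate names "perccli64" and "perccli" are assumptions;
# check `dpkg -L perccli` for the real file list on your system.
find_perccli() {
    dir="${1:-/opt/MegaRAID/perccli}"
    for name in perccli64 perccli; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    echo "no perccli binary found in $dir" >&2
    return 1
}

# Usage sketch (not run here):
#   PERCCLI="$(find_perccli)" && sudo "$PERCCLI" /c0 show
```

If it finds nothing, it fails loudly instead of letting a later sudo command die with a confusing "command not found".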

How To

At a very high level we are going to unmount the array, destroy the Virtual Disk that represents it, pop the failing disk out of the enclosure, then re-create the Virtual Disk with 1 less disk and reformat it.

First unmount the impacted block device (this was /dev/sdb on my machine):

sudo umount /mnt/plot
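Before and after the umount it is worth confirming what is actually mounted at that path. A minimal sketch, assuming the usual /proc/mounts layout (device in column 1, mount point in column 2); the function name mount_source and its optional second argument are made up for this post.

```shell
# Print the device mounted at a given mount point, or fail if nothing is
# mounted there. The optional second argument (a mounts table to read) is a
# hypothetical knob for testing; it defaults to /proc/mounts.
mount_source() {
    awk -v mp="$1" '
        $2 == mp { src = $1 }                      # keep the last (topmost) match
        END { if (src == "") exit 1; print src }
    ' "${2:-/proc/mounts}"
}

# Usage sketch (not run here):
#   mount_source /mnt/plot || echo "/mnt/plot is not mounted"
```

On my machine this would report /dev/sdb before the umount and fail afterwards, which is exactly what you want to see before touching the Virtual Disk.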

Let’s hop into the perccli directory so we don’t need to type long commands for the rest of this tutorial:

cd /opt/MegaRAID/perccli/

Now let’s list off all your Virtual Drives to make sure you see the one you want to kill (this command effectively ‘shows all’ the stats for Controller-0 on the server):

sudo ./perccli /c0 show all

<SNIP>

VD LIST :
=======

-------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name 
-------------------------------------------------------------
0/0   RAID1 Optl  RW     Yes     RWBD  -   OFF 372.0 GB BOOT 
1/1   RAID0 Optl  RW     Yes     RWTD  -   OFF 2.725 TB      
-------------------------------------------------------------

<SNIP>

You should see your Virtual Drive listed; in my case it was VD1 (VD0 is my boot array).
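If the full "show all" dump is too noisy, you can filter it down to just the RAID0 Virtual Drives. This is a rough sketch that assumes the VD LIST column layout shown above (DG/VD first, TYPE second, size and unit in columns 9 and 10); other controller firmware may print it differently. The function name list_raid0_vds is made up.

```shell
# Read `perccli /c0 show all` output on stdin and print one line per RAID0
# Virtual Drive: its perccli object name (v<N>) followed by its size.
# Assumes the VD LIST layout from this post; adjust columns if yours differs.
list_raid0_vds() {
    awk '$1 ~ /^[0-9]+\/[0-9]+$/ && $2 == "RAID0" {
        split($1, id, "/")        # "1/1" -> id[1] = DG, id[2] = VD
        print "v" id[2], $9, $10
    }'
}

# Usage sketch (not run here):
#   sudo ./perccli /c0 show all | list_raid0_vds
```

The v<N> it prints is the same name the delete command expects after /c0/, so it doubles as a sanity check that you are about to target the right array and not your boot VD.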

To delete VD1, I did the following and it was almost too easy…

sudo ./perccli /c0/v1 del

At this point your Virtual Disk is smoked and all data is lost. Go ahead and POP the failing drive out of the enclosure.

Now we want to REBUILD the array, but with fewer disks. To see the disks we can choose from, let’s run another show command:

sudo ./perccli /c0 show all

<SNIP>

Physical Drives = 7

PD LIST :
=======

-------------------------------------------------------------------------
EID:Slt DID State DG       Size Intf Med SED PI SeSz Model            Sp 
-------------------------------------------------------------------------
32:0      0 Onln   0   372.0 GB SAS  SSD N   N  512B HUSML4040ASS60   U  
32:1      1 Onln   0   372.0 GB SAS  SSD N   N  512B HUSML4040ASS60   U  
32:3      3 Onln   1 558.375 GB SAS  HDD N   N  512B X422_HCOBD600A10 U  
32:4      4 Onln   1 558.375 GB SAS  HDD N   N  512B X422_HCOBD600A10 U  
32:5      5 Onln   1 558.375 GB SAS  HDD N   N  512B X422_HCOBD600A10 U  
32:6      6 Onln   1 558.375 GB SAS  HDD N   N  512B X422_HCOBD600A10 U  
32:7      7 Onln   1 558.375 GB SAS  HDD N   N  512B X422_HCOBD600A10 U  
-------------------------------------------------------------------------

<SNIP>

You can see that in Enclosure 32, suddenly Disk 2 is missing – that’s the one I pulled today for failing.
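To pull just the candidate drives out of that table, something like the sketch below works. It assumes the PD LIST layout shown above (EID:Slt in column 1, DG in column 4). The function name pd_in_group is made up, and my understanding is that drives not yet assigned to any group show "-" in the DG column, so treat that value as an assumption and check your own output.

```shell
# Read `perccli /c0 show all` output on stdin and print the EID:Slt of every
# physical drive whose DG column equals the given value (e.g. "1", or "-" for
# drives that are not part of any drive group yet).
pd_in_group() {
    awk -v dg="$1" '$1 ~ /^[0-9]+:[0-9]+$/ && $4 == dg { print $1 }'
}

# Usage sketch (not run here):
#   sudo ./perccli /c0 show all | pd_in_group 1
```

The EID:Slt values it prints are what you feed into the drives= argument when re-creating the array.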

To create the new Virtual Disk with the remaining drives, issue the following command:

sudo ./perccli /c0 add vd type=raid0 drives=32:3-7

Important Note

  • “32” here refers to the Enclosure ID (EID); yours may be different. I have no idea how the PERC card decides this.
  • 3-7 is a range of slot numbers (the Slt half of the EID:Slt column) to specify all those disks. On my server the slot numbers happen to match the Disk ID (DID) values.
  • There are like 30 arguments to this command (check official guide) but I ignored all of them for ease of setup.
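Since a typo here builds the array out of the wrong disks, I like to print the command first and eyeball it. A small dry-run sketch that only assembles the same command shown above from its parts; the function name build_add_vd is made up, and it only handles a single contiguous slot range.

```shell
# Print (do NOT run) the perccli command that creates a Virtual Disk from a
# contiguous slot range in one enclosure.
# Arguments: controller number, RAID type, enclosure ID, first slot, last slot.
build_add_vd() {
    ctl="$1"; level="$2"; eid="$3"; first="$4"; last="$5"
    if [ "$first" -gt "$last" ]; then
        echo "first slot must not exceed last slot" >&2
        return 1
    fi
    printf './perccli /c%s add vd type=%s drives=%s:%s-%s\n' \
        "$ctl" "$level" "$eid" "$first" "$last"
}

# Usage sketch (not run here):
#   build_add_vd 0 raid0 32 3 7     # review the printed line, then run it with sudo
```

It is deliberately dumb: no extra arguments, no gaps in the range. If your remaining disks are not in consecutive slots, check the official guide for the drive list syntax.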

After you are done, you need to create a filesystem on the new device (run lsblk first to confirm it still shows up under the same name; mine was still /dev/sdb). In this case, I created an XFS filesystem:

sudo mkfs.xfs -f /dev/sdb 
lsblk -f

<SNIP>

sdb    xfs            67aa46c2-9905-444e-881a-57628a83209b    2.6T 

<SNIP>

The ‘-f’ argument to mkfs.xfs just forces the format because it will find remnants of an existing filesystem and halt without it.

After the format is complete, we list the block devices and see our new RAID0 array with a fully assigned UUID!

Now we can make any adjustments to /etc/fstab that might be needed to reference the new device (mine was pointing at the old UUID, so I had to update it). Then re-mount the directory and you are back in the game!
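The fstab edit is a one-line search and replace, so here is a minimal sketch of it. It assumes your fstab entry uses the UUID=<value> form; the function name swap_fstab_uuid and the OLD_UUID placeholder are made up, and it keeps a .bak copy next to the file in case something goes sideways.

```shell
# Replace one UUID with another in an fstab-style file, leaving a .bak backup.
# Arguments: old UUID, new UUID, file (defaults to /etc/fstab).
# Fails without touching the file if the old UUID is not present.
swap_fstab_uuid() {
    old="$1"; new="$2"; file="${3:-/etc/fstab}"
    if ! grep -q "UUID=$old" "$file"; then
        echo "UUID=$old not found in $file" >&2
        return 1
    fi
    sed -i.bak "s/UUID=$old/UUID=$new/" "$file"
}

# Usage sketch (in a root shell, not run here; OLD_UUID is a placeholder for
# the stale value currently in your fstab):
#   new_uuid="$(blkid -s UUID -o value /dev/sdb)"
#   swap_fstab_uuid "$OLD_UUID" "$new_uuid"
#   mount /mnt/plot
```

Try it on a copy of your fstab first if you are nervous; a broken fstab is a great way to end up needing that reboot we were trying to avoid.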
