path: root/Manage.c
Commit message, Author, Age
* Fix readding of a readwrite drive into a writemostly arrayDoug Ledford2011-09-19
| | | | | | | | | | | | | | | | | | | | | If you create a two drive raid1 array with one device writemostly, then fail the readwrite drive, when you add a new device, it will get the writemostly bit copied out of the remaining device's superblock into its own. You can then remove the new drive and readd it as readwrite, which will work for the readd, but it leaves the stale WriteMostly1 bit in devflags resulting in the device going back to writemostly on the next assembly. The fix is to make sure that A) when we readd a device and we might have filled the st->sb info from a running device instead of the device being readded, then clear/set the WriteMostly1 bit in the super1 struct in addition to setting the disk state (ditto for super0, but slightly different mechanism) and B) when adding a clean device to an array (when we most certainly did copy the superblock info from an existing device), then clear any writemostly bits. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
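The clear/set step described in this message can be sketched as a small bit operation. This is only an illustration, not the code from the commit: the constant `WRITE_MOSTLY1` and the helper `sync_writemostly` are hypothetical names, and the bit value is an assumption.

```c
#include <stdint.h>

/* Assumed value standing in for the WriteMostly1 flag bit of v1.x metadata. */
#define WRITE_MOSTLY1 0x01

/*
 * Return devflags with the write-mostly bit forced to match what the
 * user asked for, leaving every other flag bit untouched. Forcing the
 * bit both ways avoids inheriting a stale value copied from another
 * device's superblock.
 */
static uint8_t sync_writemostly(uint8_t devflags, int want_writemostly)
{
    if (want_writemostly)
        return (uint8_t)(devflags | WRITE_MOSTLY1);
    return (uint8_t)(devflags & (uint8_t)~WRITE_MOSTLY1);
}
```

Calling `sync_writemostly(flags, 0)` on a re-add as readwrite would drop the stale bit, so the next assembly no longer treats the device as writemostly.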
* Discourage large devices from being added to 0.90 arrays.NeilBrown2011-09-08
| | | | | | | | 0.90 arrays can only use up to 4TB per device. So when a larger device is added, complain a bit. Still allow it if --force is given as there could be a valid use. Signed-off-by: NeilBrown <neilb@suse.de>
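A minimal sketch of the "complain, but allow with --force" rule, under stated assumptions: "4TB" is read here as 4 TiB, and `check_090_device_size` and its warning text are hypothetical, not taken from mdadm.

```c
#include <stdio.h>

/* Assumption: the "4TB" limit is interpreted as 4 TiB, expressed in bytes. */
#define MAX_090_DEV_BYTES (4ULL << 40)

/*
 * Returns 0 if the add may proceed, -1 if it should be refused.
 * An oversized device always produces a warning; `force` (standing in
 * for --force) only decides whether the add is still allowed.
 */
static int check_090_device_size(unsigned long long dev_bytes, int force)
{
    if (dev_bytes <= MAX_090_DEV_BYTES)
        return 0;

    fprintf(stderr, "warning: device exceeds what 0.90 metadata can use\n");
    return force ? 0 : -1;
}
```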
* Check all member devices in enough_fdNeilBrown2011-05-23
| | | | | | | | | | | | | | The loop over all member devices in enough_fd could easily stop before it had found all devices. This would cause --re-add to fail incorrectly. So change the loop to be based on the reported number of devices in the device - with a safe-guard limit of 1024. Change some other loops to be more careful too. Reported-by: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com> Signed-off-by: NeilBrown <neilb@suse.de>
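The loop shape described here (keep scanning until the reported number of devices has been seen, capped at 1024 slots) might look roughly like the following. The probe callback and the string-based slot map are hypothetical stand-ins for querying each slot of a real array.

```c
#include <string.h>

#define MAX_SLOTS 1024  /* safeguard limit mentioned in the commit message */

/* Hypothetical probe: returns 1 if the slot holds a device, 0 if it is empty. */
typedef int (*slot_probe_fn)(int slot, void *ctx);

/*
 * Count member devices. Empty slots are skipped rather than ending the
 * scan; the loop stops once `reported` devices were found or the
 * safeguard limit is reached.
 */
static int count_members(int reported, slot_probe_fn probe, void *ctx)
{
    int found = 0;

    for (int slot = 0; slot < MAX_SLOTS && found < reported; slot++)
        if (probe(slot, ctx))
            found++;
    return found;
}

/* Example probe over a map string: 'x' marks an occupied slot. */
static int probe_map(int slot, void *ctx)
{
    const char *map = ctx;

    return (size_t)slot < strlen(map) && map[slot] == 'x';
}
```

With the map `"x..x.x"` and three reported devices, the scan continues across the gaps instead of stopping at the first empty slot.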
* Manage: minor fix to add/re-add handling.NeilBrown2011-05-10
| | | | | | | | If using an old kernel we should still check if a re-add might be intended, so we can refuse and require a '--zero' first if it is not possible. Signed-off-by: NeilBrown <neilb@suse.de>
* Merge branch 'master' into devel-3.2NeilBrown2011-03-24
|\ | | | | | | | | | | | | | | | | | | | | | | Conflicts: Incremental.c Manage.c ReadMe.c inventory mdadm.8.in mdadm.spec mdassemble.8 mdmon.8
| * --stop: separate 'is busy' test for 'did it stop properly'.NeilBrown2011-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stopping an md array requires that there is no other user of it. However with udev and udisks and such there can be transient other users of md devices which can interfere with stopping the array. If there are transient users, we really want "mdadm --stop" to wait a little while and retry. However if the array is genuinely in-use (e.g. mounted), then we don't want to wait at all - we want to fail immediately. So before trying to stop, re-open device with O_EXCL. If this fails then the device is probably in use, so give up. If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly a transient failure, so try again for a few seconds. Signed-off-by: NeilBrown <neilb@suse.de>
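The two-stage logic above (exclusive open as the "is busy" test, then a bounded retry of the stop itself) can be sketched with callbacks so that it runs without a real md device. Everything named here (`stop_ops`, `stop_with_retry`, the `fake_*` helpers) is hypothetical; in real code the callbacks would wrap an open with O_EXCL, the STOP_ARRAY ioctl, and a short sleep.

```c
enum stop_result { STOP_OK, STOP_BUSY, STOP_FAILED };

struct stop_ops {
    int  (*open_excl)(void *ctx); /* stand-in for re-opening with O_EXCL; 0 on success */
    int  (*stop)(void *ctx);      /* stand-in for the STOP_ARRAY ioctl; 0 on success */
    void (*pause)(void *ctx);     /* stand-in for a short sleep between attempts */
};

static enum stop_result stop_with_retry(const struct stop_ops *ops, void *ctx,
                                        int max_tries)
{
    /* Exclusive open fails: the array is genuinely in use, so do not wait. */
    if (ops->open_excl(ctx) != 0)
        return STOP_BUSY;

    /* Exclusive open worked, so a failing stop is probably transient. */
    for (int i = 0; i < max_tries; i++) {
        if (ops->stop(ctx) == 0)
            return STOP_OK;
        ops->pause(ctx);
    }
    return STOP_FAILED;
}

/* Fake array used only to exercise the sketch. */
struct fake_array { int in_use; int transient_failures; int pauses; };

static int fake_open_excl(void *ctx)
{
    return ((struct fake_array *)ctx)->in_use ? -1 : 0;
}

static int fake_stop(void *ctx)
{
    struct fake_array *a = ctx;

    if (a->transient_failures > 0) {
        a->transient_failures--;
        return -1;
    }
    return 0;
}

static void fake_pause(void *ctx)
{
    ((struct fake_array *)ctx)->pauses++;
}

static const struct stop_ops fake_ops = { fake_open_excl, fake_stop, fake_pause };
```

Separating the two outcomes lets the caller report "in use" immediately for a mounted array while still tolerating a brief visit from udev or udisks.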
* | FIX: Add spare throws exception (v2)Adam Kwolek2011-03-20
| | | | | | | | | | | | | | | | | | | | | | sync_metadata() requires st->sb to be loaded, otherwise an exception is generated. This makes expansion fail, because spares cannot be added. The metadata update now uses the tst pointer instead of st, which is better than loading the anchor for st as I proposed previously. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Retry writing 'inactive' state during stopping arrayKrzysztof Wojcik2011-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Issue observed: Stopping arrays with the "mdadm -Ss" command sporadically does not succeed. Cause: Writing "inactive" to the array state does not succeed because the array is busy (accessed by udev, blkid etc.). Resolution: If writing 'inactive' fails, wait and retry (because it is possibly a transient failure). Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | FIX: ping_monitor() usage causes memory leaksAdam Kwolek2011-03-18
| | | | | | | | | | | | | | | | | | | | When devnum2devname() is used to produce the input for ping_monitor(), the returned string pointer should be passed to free() to release the memory. This is not done in several places. This use case should have a function of its own to avoid the memory leak. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Manage: fix the mess I made in earlier patch.NeilBrown2011-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When I separated the 'native metadata' case more cleanly from the "external metadata" case for adding a drive, I left some 'external' code in the 'native' case, and didn't copy it to the 'external' case. When - in the external case - we add to super, we must check for mdmon first, so we know whether to do the metadata update ourselves or not, then afterwards call either flush_metadata_updates (to send to mdmon) or sync_metadata (to do it directly). Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | --stop: separate 'is busy' test for 'did it stop properly'.NeilBrown2011-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stopping an md array requires that there is no other user of it. However with udev and udisks and such there can be transient other users of md devices which can interfere with stopping the array. If there are transient users, we really want "mdadm --stop" to wait a little while and retry. However if the array is genuinely in-use (e.g. mounted), then we don't want to wait at all - we want to fail immediately. So before trying to stop, re-open device with O_EXCL. If this fails then the device is probably in use, so give up. If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly a transient failure, so try again for a few seconds. Signed-off-by: NeilBrown <neilb@suse.de>
* | Merge branch 'master' into devel-3.2NeilBrown2011-03-15
|\| | | | | | | | | | | | | | | Conflicts: Manage.c managemon.c super-ddf.c super-intel.c
| * Manage/external: for external metadata, add_to_super needs lock on container.NeilBrown2011-03-15
| | | | | | | | | | | | | | | | | | | | add_to_super could use information from the current superblock (ddf does), so add_to_super for external metadata should be called with the O_EXCL lock held on the container to ensure the update is complete before any other process tries to make any changes (like adding another device to array). Signed-off-by: NeilBrown <neilb@suse.de>
| * Manage: be more careful about --add attempts.NeilBrown2011-03-10
| | | | | | | | | | | | | | | | | | If an --add is requested and a re-add looks promising but fails or cannot possibly succeed, then don't try the add. This avoids inadvertently turning devices into spares when an array is failed but the devices seem to actually work. Signed-off-by: NeilBrown <neilb@suse.de>
| * Fix regression with removing 'failed' and 'detached' devices.NeilBrown2011-02-15
| | | | | | | | | | | | | | | | | | | | | | | | If a request to remove all 'failed' or 'detached' devices chooses to remove the first device, it will not actually try the removal and will skip any following devices. This fixes it. Reported-by: Rémi Rérolle <rrerolle@lacie.com> Tested-by: Rémi Rérolle <rrerolle@lacie.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | modified message on failure to read metadata in ManageCzarnowska, Anna2011-02-21
| | | | | | | | | | | | | | | | | | | | | | Loading container may fail if e.g. one of the disks in container has been detached but udev has not realized the change. Addition to such array will fail because reading superblock from one of disks in array fails. Current message is a bit confusing. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Fix regression with removing 'failed' and 'detached' devices.NeilBrown2011-02-15
| | | | | | | | | | | | | | | | | | | | | | | | If a request to remove all 'failed' or 'detached' devices chooses to remove the first device, it will not actually try the removal and will skip any following devices. This fixes it. Reported-by: Rémi Rérolle <rrerolle@lacie.com> Tested-by: Rémi Rérolle <rrerolle@lacie.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Call free_super before attempting to add a new deviceNeilBrown2011-01-31
| | | | | | | | | | | | | | | | Now that write_init_super doesn't close fds any more, we need to call free_super before the ADD_NEW_DISK ioctl. Also call free_super before some error returns, for cleanliness. Signed-off-by: NeilBrown <neilb@suse.de>
* | Don't close fds in write_init_superNeilBrown2011-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We previously closed all 'fds' associated with an array in write_init_super .. sometimes, and sometimes at bad times. This isn't neat and free_super is a better place to close them. So make sure free_super always closes the fds that the metadata manager kept hold of, and stop closing them in write_init_super. Also add a few more calls to free_super to make sure they really do get closed. Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Add spares to raid0 in mdadmAdam Kwolek2011-01-06
| | | | | | | | | | | | | | | | | | When the user wants to add spares to a container that holds only raid0 arrays, the metadata cannot be updated because no mdmon is running. To allow for this, a direct metadata update by mdadm is used in that case. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | move_spare function modified and moved to Manage.cAnna Czarnowska2011-01-05
| | | | | | | | | | | | | | It will also be needed for Incremental. Signed-off-by: Anna Czarnowska <anna.czarnowska@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Allow --update=devicesize with --re-addNeilBrown2010-12-09
| | | | | | | | | | | | | | | | | | | | This is useful with 1.1 and 1.2 metadata to update the metadata if the device size has changed. The same functionality can be achieved by writing to the device size in sysfs after re-adding normally, but in some cases this might be easier. Signed-off-by: NeilBrown <neilb@suse.de>
* | Manage: allow manual control of external raid0 readonly flagDan Williams2010-11-23
| | | | | | | | | | | | | | | | | | mdadm --readwrite <subarray> will clear the external readonly flag ('-' to '/'), but only for redundant arrays. Allow raid0 arrays as well so the user has a simple helper to control this flag. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Replace various load_super calls with load_containerNeilBrown2010-11-22
| | | | | | | | | | | | | | When we call load_super expecting to find a container, we now just call load_container directly. Signed-off-by: NeilBrown <neilb@suse.de>
* | Improve type names for mddev_devNeilBrown2010-11-22
| | | | | | | | | | | | | | | | | | Remove the _t pointer typedef and remove the _s suffix for the structure. These things do not help readability. Signed-off-by: NeilBrown <neilb@suse.de>
* | Improve mddev_ident type definitions.NeilBrown2010-11-22
| | | | | | | | | | | | | | | | Remove the _t typedef and remove the _s suffix from the struct name. These things do not help readability. Signed-off-by: NeilBrown <neilb@suse.de>
* | Pass subarray arg explicitly to ->update_subarray.NeilBrown2010-11-22
| | | | | | | | | | | | | | This is better than hiding it in the supertype structure where we are never quite sure who needs it. Signed-off-by: NeilBrown <neilb@suse.de>
* | super_by_fd: return subarray info explicitly.NeilBrown2010-11-22
| | | | | | | | | | | | | | | | | | Rather than hiding this in the 'st', return it explicitly. In the one case we still need it, copy it into st where needed. This will disappear in a future patch. Signed-off-by: NeilBrown <neilb@suse.de>
* | open_subarray: pass subarray name as explicit arg.NeilBrown2010-11-22
| | | | | | | | | | | | | | | | | | | | | | Rather than hiding this arg in the 'st' structure, pass it explicitly. This is a first step to getting rid of 'subarray' from 'supertype'. The strcpy in open_subarray should have better error checking, but it will disappear soon so there is little point. Signed-off-by: NeilBrown <neilb@suse.de>
* | get_info_super: report which other devices are thought to be working/failed.NeilBrown2010-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | To accurately detect when an array has been split and is now being recombined, we need to track which other devices each thinks is working. We should never include a device in an array if it thinks that the primary device has failed. This patch just allows get_info_super to return a list of devices and whether they are thought to be working or not. Signed-off-by: NeilBrown <neilb@suse.de>
* | Manage: be more careful about --add attempts.NeilBrown2010-11-22
|/ | | | | | | | | If an --add is requested and a re-add looks promising but fails or cannot possibly succeed, then don't try the add. This avoids inadvertently turning devices into spares when an array is failed but the devices seem to actually work. Signed-off-by: NeilBrown <neilb@suse.de>
* Fix spare migration.NeilBrown2010-08-31
| | | | | | | Spare migration uses major:minor device names. When we added support for kernel style names, we broke that. Signed-off-by: NeilBrown <neilb@suse.de>
* Don't remove md devices with standard names.NeilBrown2010-08-31
| | | | | | | | | | | | | | | | If udev is not in use, we create devices in /dev when assembling arrays and remove them when stopping the array. However it may not always be correct to remove the device. If the array was started with kernel auto-detect, then mdadm didn't create anything and so shouldn't remove anything. We don't record whether we created things, so just don't remove anything with a 'standard' name. Only remove symlinks to the standard name as we almost certainly created those. Reported-by: Petre Rodan <petre.rodan@avira.com> Signed-off-by: NeilBrown <neilb@suse.de>
* Compile with -Wextra by defaultNeilBrown2010-08-05
| | | | | | This produced lots of warnings, some of which pointed to actual bugs. Signed-off-by: NeilBrown <neilb@suse.de>
* Two Minor bug fixes to incremental supportDoug Ledford2010-07-22
| | | | | | | | | | One: a single character typo (of instead of or in an error printout) Two: Audited usage of tfd file descriptor. Make sure that the tfd file is always closed after usage, and that the tfd variable is reset to -1 if we are going to continue in our loop (not necessary if we know we will return from our function without going through the dv loop again). Signed-off-by: Doug Ledford <dledford@redhat.com>
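The "always close, then reset to -1" discipline for a descriptor that is reused across loop iterations can be captured in one small helper. This is a sketch only; `close_and_reset` is a hypothetical name and not a function from mdadm.

```c
#include <unistd.h>

/*
 * Close the descriptor if it is open and mark it as closed, so that a
 * later iteration of the loop cannot reuse or double-close a stale value.
 */
static void close_and_reset(int *fd)
{
    if (*fd >= 0) {
        close(*fd);
        *fd = -1;
    }
}
```

Because the helper is a no-op on -1, it can be called unconditionally on every path that continues the loop.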
* Merge branch 'master' of git://github.com/djbw/mdadmNeilBrown2010-07-06
|\
| * Rename subarray v2Dan Williams2010-06-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow the name of the array stored in the metadata to be updated. In some cases the metadata format may not be able to support this rename without modifying the UUID. In these cases the request will be blocked. Otherwise we allow the rename to take place, even for active arrays. This assumes that the user understands the difference between the kernel node name, the device node symlink name, and the metadata specific name. Anticipating further need to modify subarrays in-place, introduce the ->update_subarray() superswitch method. A future potential use case is setting storage pool (spare-group) identifiers. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | Add --test option to --re-add and similarNeilBrown2010-07-06
| | | | | | | | | | | | | | | | | | | | --test can be given in Manage mode. This can be used when there is an attempt to fail or remove 'faulty', 'failed' or 'detached' devices, or to re-add 'missing' devices. If no devices were failed, removed, or re-added, then mdadm will exit with status '2'. Signed-off-by: NeilBrown <neilb@suse.de>
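The exit-status rule can be written as a small decision function. Only the status '2' for "nothing was failed, removed, or re-added" comes from the message above; the function name and the use of 1 for errors are assumptions made for this sketch.

```c
/*
 * Decide the process exit status for a Manage-mode run.
 *   had_error: some operation failed (assumed here to map to status 1)
 *   test_mode: --test was given
 *   changed:   number of devices actually failed, removed, or re-added
 */
static int manage_exit_status(int test_mode, int changed, int had_error)
{
    if (had_error)
        return 1;
    if (test_mode && changed == 0)
        return 2;   /* --test: report that nothing was done */
    return 0;
}
```

A script could then distinguish "re-add missing found nothing to do" from success by checking for status 2.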
* | Add support for "--re-add missing"NeilBrown2010-07-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the device name "missing" is given for --re-add, then mdadm will attempt to find any device which should be a member of the array but currently isn't and will --re-add it to the array. This can be useful if a device disappeared due to a cabling problem, and was then re-connected. The appropriate sequence would be
| |     mdadm /dev/mdX --fail detached
| |     mdadm /dev/mdX --remove detached
| |     mdadm /dev/mdX --re-add missing
| | Signed-off-by: NeilBrown <neilb@suse.de>
* | Avoid skipping devices where removing all faulty/detached devices.NeilBrown2010-06-30
| | | | | | | | | | | | | | | | | | | | | | | | When using 0.90 metadata, devices can be renumbered when earlier devices are removed. So when iterating all devices looking for 'failed' or 'detached' devices, we need to re-check the same slot we checked last time to see if maybe it has a different device now. Reported-by: Jim Paris <jim@jtan.com> Resolves-Debian-Bug: 587550 Signed-off-by: NeilBrown <neilb@suse.de>
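The renumbering hazard described here is the usual "remove while iterating" problem. The sketch below models it with a plain array that compacts on removal; `struct member` and `remove_faulty` are hypothetical and only illustrate why the same slot has to be checked again.

```c
#include <stddef.h>

struct member { int id; int faulty; };

/*
 * Remove every faulty member from a list that compacts on removal
 * (later entries move down one slot). Returns the new length.
 */
static size_t remove_faulty(struct member *devs, size_t n)
{
    size_t slot = 0;

    while (slot < n) {
        if (devs[slot].faulty) {
            for (size_t j = slot + 1; j < n; j++)
                devs[j - 1] = devs[j];
            n--;
            /* A different device may now occupy this slot: re-check it. */
            continue;
        }
        slot++;
    }
    return n;
}
```

A loop that advanced the slot after every removal would skip the second of two adjacent faulty devices, which is the behavior this commit fixes.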
* | Add -fail support to --incrementalNeilBrown2010-06-30
| | | | | | | | | | | | | | | | | | | | | | | | | | This can be used for hot-unplug. When a device has been removed, udev can call mdadm --incremental --fail sda and mdadm will find the array holding sda and remove sda from the array. Based on code from Doug Ledford <dledford@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Support fail/remove using kernel nameNeilBrown2010-06-30
|/ | | | | | | | | | Allow kernel names like "sda" and "hdb1" to be used to fail/remove devices from an array. This is useful as after a device has been removed it can be difficult to get the major/minor number. Signed-off-by: NeilBrown <neilb@suse.de>
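One way a kernel name can be used without a major/minor number is through the per-member directory that md exposes in sysfs. The sketch below only builds such a path; the helper name is hypothetical, and the assumption that the commit works through this exact sysfs file is not confirmed by the message above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build the sysfs path of a member's state file from kernel names,
 * e.g. ("md0", "sda") -> "/sys/block/md0/md/dev-sda/state".
 * Returns 0 on success, -1 if a name contains '/' or the buffer is
 * too small.
 */
static int member_state_path(char *buf, size_t len,
                             const char *md_name, const char *member)
{
    int n;

    if (strchr(md_name, '/') || strchr(member, '/'))
        return -1;  /* only bare kernel names are accepted here */

    n = snprintf(buf, len, "/sys/block/%s/md/dev-%s/state", md_name, member);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```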
* Create: cleanup after failed create in duplicated array member caseDan Williams2010-04-19
| | | | | | | | | | | | | | | | mdadm prevents creation when device names are duplicated on the command line, but leaves the partially created array intact. Detect this case in the error code from add_to_super() and cleanup the partially created array. The imsm handler is updated to report this conflict in add_to_super_imsm_volume(). Note that since neither mdmon, nor userspace for that matter, ever saw an active array we only need to perform a subset of the cleanup actions. So call ioctl(STOP_ARRAY) directly and arrange for Create() to cleanup the map file rather than calling Manage_runstop(). Reported-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* Stop: don't stop a container which still has members active.NeilBrown2010-03-09
| | | | | | Doing that is just confusing... Signed-off-by: NeilBrown <neilb@suse.de>
* Manage: fix regression on removing detached devices.NeilBrown2010-03-03
| | | | | | | | | | | | If /dev is static, a name may remain there after the device has been detached from the system. Using 'mdadm' to remove such a device from the array should still work (even though "mdadm --remove detached" might be preferred). So when processing a device for '-r', don't insist on being able to open the device. Signed-off-by: NeilBrown <neilb@suse.de>
* Merge branch 'master' of git://github.com/djbw/mdadmNeilBrown2009-12-30
|\
| * Support external metadata recovery-resumeDan Williams2009-12-21
| | | | | | | | | | | | | | | | | | Minimal changes needed to permit reassembling partially recovered external metadata arrays. The biggest logical change is that ->container_content() can now surface partially rebuilt members rather than omitting them from the disk list. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
| * Teach sysfs_add_disk() callers to use ->recovery_start versus 'insync' parameterDan Williams2009-12-21
| | | | | | | | | | | | Also fixup 'in_sync' versus 'insync' typo. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | Don't attempt a re-add if the device is marked as faulty.NeilBrown2009-12-08
|/ | | | | | | | If a device is marked as faulty, then a re-add will cause it to be added as a faulty drive, which is not what is wanted. So just refuse to try to re-add a device which is marked 'faulty'. Signed-off-by: NeilBrown <neilb@suse.de>
* Don't silently map --re-add to --addNeilBrown2009-11-17
| | | | | | | | | As --add can destroy important data on a disk, and --re-add is not supposed to, it is wrong to silently try --add if --re-add fails. So print a message and abort instead. Signed-off-by: NeilBrown <neilb@suse.de>