http://aa11.cjb.net/hpux_admin/1997/0249.html
I asked "which is better (and WHY?): find|cpio or tar for copying a disk to a slightly smaller drive."
About 15 replies have arrived, most of which pointed out my lack of "-depth" in my example. Valid point. Mea culpa. -depth makes find list a directory's contents before the directory itself, which lets cpio restore each directory's time-stamp after filling it. Two folks also added the "u" (unconditional overwrite) option… immaterial in my case, since I was going to a blank disk.
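For reference, here is the corrected pipeline: the flags are the ones from my original posting (below) plus the two suggestions, with an illustrative target path:

   cd /old_disk
   # -depth: list contents before their directory, so cpio can set
   #         directory time-stamps after the contents are in place
   # u:      overwrite existing files unconditionally (a no-op here,
   #         since the target disk was blank)
   find . -depth -print | cpio -pdxmu /new_disk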
Mark Jones at Motorola suggested the obvious test (which I didn't run, due to other time pressures) and noted that tar cannot handle very long pathnames, which -is- a very good reason to use the find|cpio method.
> From: Mark Jones <mjones@pencom.com>
>
> To answer your question, try both with the time command:
>
>   time tar -cf - * | (cd /other_place; tar -xf - )
>   time find . -xdev -depth -print | cpio -pmdux /other_place
>
> Tar has a limitation on the number of characters it can
> read in a path. Since we have long paths here, we always
> use cpio. I don't know if that limitation was fixed
> with 10.X. We are still running 9.05.
(as it happened, my copy-over was on a v9.01 system)
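The limit Mark mentions comes from the fixed-size name field in a tar header: the classic format holds at most 100 characters of pathname (the later POSIX ustar format adds a 155-character prefix field). A quick way to see it for yourself; the directory names are made up, and exactly where (or whether) your tar gives up depends on which tar you have:

   cd /tmp
   rm -rf deep copy
   mkdir deep copy
   D=deep
   for i in 1 2 3 4 5 6 7 8 9
   do
       D=$D/a_rather_long_directory_name
       mkdir $D
   done
   touch $D/file
   tar -cf /dev/null deep                          # expect "name too long"
   find deep -depth -print | cpio -pdm /tmp/copy   # cpio copies it anyway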
Tom Coates (tom_coates@trimble.com) preferred find|cpio for these reasons:
> I doubt there would be much difference in speed, since most of the
> time is probably taken up with just transferring the data. This
> seems even more likely since you are copying an entire disk.
>
> I've always used find|cpio, as it gives very good control over
> preserving file modification dates, etc. Also, once you get good
> at writing find commands, you can filter the copied files to get
> only what you want. I've had fits in the past with tar, trying to get
> it to copy several trees from different locations, involving symbolic
> links, into a single new location. Probably the nicest feature of
> find|cpio is that you can work out the find command first, to see
> exactly what is going to be copied. Then you just repeat the command
> piped to cpio to do the actual copying.
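Tom's preview-then-copy workflow, spelled out (the "! -name core" filter is just an illustration of the filtering he mentions):

   cd /old_disk
   # dry run: see exactly which files the find selects
   find . -xdev -depth ! -name core -print | more
   # satisfied? repeat the identical find, piped to cpio, to do the copy
   find . -xdev -depth ! -name core -print | cpio -pdm /new_disk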
Chris Marble (chris_marble@hmc.edu) suggested adding the "-depth", and said:
> I think it's just personal taste. I always recommend the
> cpio command and have been posting it for about 2 1/2 years when
> anyone asks. On my SGI systems I use dump and restore to copy disks.
> I think the cpio could handle CDFs (Context Dependent Files)
> better than anything else. But CDFs don't exist with HP-UX 10.
Tony Kruse (akruse1@ford.com) suggested a third method (which had crossed my mind, but not in the multi-reader sense he mentions):
> I always use
>
>   fbackup -c /usr/adm/fbackupfiles/backup.config -i /usr -f - | \
>     (cd /mnt; frecover -Xrf -)
>
> since I can specify 6 filesystem readers to keep the 1 fbackup
> writer busy in my backup.config file.
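For anyone who hasn't built one, a backup.config along the lines Tony describes might look like the following; the keywords follow the fbackup(1M) configuration file, the values are only illustrative, and "readers 6" is the line doing the work he describes:

   blocksperrecord 256
   records 32
   checkpointfreq 256
   readers 6
   maxretries 5
   retrylimit 5000000
   maxvoluses 200
   filesperfsm 2000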
One reader didn't notice that I was going to a -smaller- disk and suggested "dd"; another warned me that tar might not copy symbolic links (it does).
For what it's worth, my find|cpio of a disk with 1.8 gigs used, on a 9000/710 running v9.01, took about 1.5 hours (Digital Equipment DSP3210 to HP C2490A).
One thing find|cpio did NOT do "properly": in a large number of instances it couldn't or wouldn't set a file's group to match the original. Usually the old group was "other", and it ended up "sys". Tar, by contrast, slavishly sets the owner/group to their original numeric values, even if those don't exist (in a cross-machine operation).
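A quick, admittedly crude, way to check for that sort of drift after a copy (paths illustrative):

   # compare long listings of the two trees...
   (cd /old_disk && ls -lR) > /tmp/old.lsr
   (cd /new_disk && ls -lR) > /tmp/new.lsr
   diff /tmp/old.lsr /tmp/new.lsr
   # ...or hunt for the symptom directly
   find /new_disk -group sys -print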
dick seymour
p.s.: in OpenVMS the command to use is "BACKUP/IMAGE in-drive out-drive".
My original posting (-depth added):
> I've got two similar, but not-exactly-equal, disks... and I want to
> copy the filesystem (in this instance: not root, not needing to be bootable)
> from one to the other.
>
> Which method is "better", and why?
>
>   cd /old_disk ; find . -depth -print | cpio -pdxm /new_disk
>
> or
>
>   cd /old_disk ; tar -cf - . | (cd /new_disk; tar -xf - )
>
> In previous exercises like this, I've happily used "tar", and the
> results seemed to perform properly. However, I've noticed postings
> here recommending the "find" route... which I'm using at the
> moment to migrate users from a screeching 2.1 gig disk to a new 2.06
> gig disk.
>
> So, what's the difference? Which is faster? Which is fraught with peril?
> "tar" would seem to expend more-than-necessary CPU cycles, since its
> original goal was an archive file with internal structure, but I'm
> primarily concerned with wall-clock time.