Transferring or Archiving with Large Numbers of Files
Have you ever encountered an "Argument list too long" error? For example:
$ tar -cvjf files.tbz *.tre && scp files.tbz glyphoglossus:Projects/sim20data/
-bash: /bin/tar: Argument list too long
This is caused by the operating system's limit on the total size of the argument list that can be passed to a command. The glob, "*", expands into more arguments than can be handed to "tar" in a single invocation. There are a number of ways to solve this.
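The limit itself can be inspected with "getconf". This is a minimal sketch; the variable name "arg_max" is only illustrative, and the value reported differs from system to system:

```shell
# Ask the system for the maximum combined size, in bytes, of the
# arguments and environment that can be passed to a new program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: ${arg_max} bytes"
```

When the expanded glob exceeds this size, the command fails with "Argument list too long" before it even starts.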
[Added: Jul 28 2010] The easiest way to accomplish this is a 2-step process:
- Create a list of the files to be archived using the "find" command:
$ find . -name "*.tre" > filelist.txt
- Use the "-T" option of the "tar" command to pass in this list of filenames:
$ tar cvjf archive.tbz -T filelist.txt
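The two steps can also be combined into one pipeline that copes with spaces or newlines in filenames. The following is a sketch under some assumptions: it needs GNU tar (for "--null"), it runs in a throwaway directory with made-up sample files, and it uses gzip ("-z") only to keep the demo dependency-light; substitute "-j" to get bzip2 compression as in the commands above.

```shell
# Scratch directory with a few sample files, one with a space in its name.
demo=$(mktemp -d)
cd "$demo"
touch a.tre b.tre "c d.tre"

# find prints NUL-terminated names; tar reads the list from stdin ("-T -").
# --null is a GNU tar option and has to come before -T.
find . -name "*.tre" -print0 | tar -czf archive.tgz --null -T -

# List the archive to confirm that every file was stored.
tar -tzf archive.tgz
```

Because the names travel through a pipe rather than the command line, the argument-size limit never comes into play, and no intermediate list file is left behind.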
Alternatively, you can use a shell loop or the "find" command to process the files one at a time. For example, to move them to another directory:
$ for f in *.tre; do mv "$f" ~/other/dir; done
or
$ find . -name "*.tre" -exec mv {} ~/other/dir \;
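Running one "mv" per file can be slow with thousands of files. As a hedged variation, "find" can batch the names itself by ending "-exec" with "+" instead of "\;". This sketch assumes GNU mv (for the "-t" option) and uses hypothetical "src" and "dest" directories inside a temporary directory:

```shell
# Throwaway source and destination directories with sample files.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
touch "$demo/src/a.tre" "$demo/src/b.tre" "$demo/src/c d.tre"

# "{} +" tells find to pass as many names to each mv invocation as fit
# under the system limit, starting another mv only when needed.
# "-t DIR" (destination given first) is a GNU mv option.
find "$demo/src" -name "*.tre" -exec mv -t "$demo/dest" {} +

ls "$demo/dest"
```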
However, compressing the files individually and then transferring them would result in thousands of individual zip/gz files: the extra CPU, bandwidth and disk storage overhead of dealing with all those individual files, not to mention the hassle, makes this a very unattractive approach.
The "tar" command has a "-r" option, which appends files to an existing archive. But I kept getting thwarted when trying to use it, until I realized that it only works with uncompressed archives. So, the correct solution is a two-step approach: (1) collect all the files into one big uncompressed tarball using a shell "for" loop or "find" command, and (2) then compress the tarball:
$ for i in *.tre; do tar -f ~/scratch/manyfiles.tar -rv "$i"; done
$ bzip2 ~/scratch/manyfiles.tar
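The loop above starts "tar" once per file. A possible refinement, sketched here in a temporary directory with made-up sample files, is to let "find" append the files in batches with "-exec ... {} +". The compression step afterwards is unchanged.

```shell
# Scratch directory with a few sample files.
demo=$(mktemp -d)
cd "$demo"
touch a.tre b.tre "c d.tre"

# find runs "tar -r" once per batch of names rather than once per file;
# each batch is appended to the same uncompressed archive.
find . -name "*.tre" -exec tar -rf manyfiles.tar {} +

# Confirm that all files ended up in the tarball.
tar -tf manyfiles.tar
```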
Now the resulting file "~/scratch/manyfiles.tar.bz2" can be transferred in one big, but compressed, secure copy with impunity.
The creation of a list of files and the use of the "-T" option of "tar" seems to me the best way to go about this, however!