Remove directory structure when extracting tar archive

Tag: tar, directory-structure Author: W181307473 Date: 2011-10-12

Suppose I have created a tar archive on Server A using tar -czvf subdir.tgz subdirectory/*. I have a parallel directory structure on Server B, to which I've copied this tarball, and I now want to clobber that same subdirectory, including removing any files in it that are not present in the tar archive. I could cd into the parent directory on Server B, run rm -rf subdirectory, and then run tar -xzvf subdir.tgz.
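For reference, the two-step baseline described above could be wrapped up roughly like this. It is only a sketch: the function name replace_subdir and its argument order are made up for illustration, and the tarball path is assumed to be absolute (or relative to the parent directory).

```shell
#!/bin/sh
# Sketch of the baseline: delete the old tree, then extract the tarball.
# replace_subdir PARENT TARBALL SUBDIR   (hypothetical helper, not part of tar)
replace_subdir() {
    parent=$1
    tarball=$2
    subdir=$3
    # Refuse to run with missing arguments so rm -rf never sees an empty name.
    if [ -z "$parent" ] || [ -z "$tarball" ] || [ -z "$subdir" ]; then
        echo "usage: replace_subdir PARENT TARBALL SUBDIR" >&2
        return 2
    fi
    (
        # Work in a subshell so the caller's working directory is unchanged.
        cd "$parent" || exit 1
        # Only extract if the old tree was removed successfully.
        rm -rf -- "./$subdir" &&
        tar -xzf "$tarball"
    )
}
```

It would be called as, for example, replace_subdir /path/to/parent /path/to/subdir.tgz subdirectory. This still runs rm, so it only restates the starting point of the question.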

Is there a way to do this without having to first run the rm?

As a test, I created a dummy directory on Server B at subdirectory/dummy/ and then ran tar -xzvf subdir.tgz --recursive-unlink, but to my surprise subdirectory/dummy/ was still there. I guess I misunderstood what --recursive-unlink does.

EDIT: Here's an example. I create two directories that represent the two "mirrored" directories on two different servers (think of them as server 'a' and server 'b'). In the destination directory ('b') I create an extra directory and a file that are not present in the tar archive. I'm expecting these to be clobbered and removed. Essentially, all I want in directory 'b' when I'm done is the files from directory 'a'. But, as the example shows, despite --unlink-first and --recursive-unlink, the superfluous directory and file are still there after extraction.

-bash-3.2$ mkdir tartest
-bash-3.2$ cd tartest
-bash-3.2$ mkdir a
-bash-3.2$ mkdir b
-bash-3.2$ cd a
-bash-3.2$ mkdir xyz
-bash-3.2$ cd xyz
-bash-3.2$ touch test.1
-bash-3.2$ touch test.2
-bash-3.2$ cd ../
-bash-3.2$ touch test.3
-bash-3.2$ cd ../b
-bash-3.2$ mkdir abc
-bash-3.2$ cd abc
-bash-3.2$ touch test.4
-bash-3.2$ cd ../../a
-bash-3.2$ tar cvf ../archive.tar ./*
./test.3
./xyz/
./xyz/test.2
./xyz/test.1
-bash-3.2$ cd ../b
-bash-3.2$ tar -xf ../archive.tar --unlink-first --recursive-unlink
-bash-3.2$ ls
total 16K
drwxr-xr-x 4 iponly168797 iponly168797 4.0K Jan 27 13:58 .
drwxr-xr-x 4 iponly168797 iponly168797 4.0K Jan 27 13:57 ..
drwxr-xr-x 2 iponly168797 iponly168797 4.0K Jan 27 13:56 abc
-rw-r--r-- 1 iponly168797 iponly168797    0 Jan 27 13:56 test.3
drwxr-xr-x 2 iponly168797 iponly168797 4.0K Jan 27 13:56 xyz
-bash-3.2$ ls xyz
total 8.0K
drwxr-xr-x 2 iponly168797 iponly168797 4.0K Jan 27 13:56 .
drwxr-xr-x 4 iponly168797 iponly168797 4.0K Jan 27 13:58 ..
-rw-r--r-- 1 iponly168797 iponly168797    0 Jan 27 13:56 test.1
-rw-r--r-- 1 iponly168797 iponly168797    0 Jan 27 13:56 test.2

Other Answer1

--recursive-unlink removes existing files and directories only if they would otherwise keep tar from extracting the new files. For your purpose, add --unlink-first. With both options, the files and directories that are not in the new tar archive are removed before extraction.

This is how it works for me:

~/tmp/tar $ tar cvf a.tar a
a/
a/test.txt
a/b/
a/c/
a/a/
~/tmp/tar $ cd b
~/tmp/tar/b $ tar xvf ../a.tar
a/
a/test.txt
a/b/
a/c/
a/a/
~/tmp/tar/b $ cd ../a
~/tmp/tar/a $ rmdir b
~/tmp/tar/a $ rm test.txt
rm: remove regular empty file 'test.txt'? y
~/tmp/tar/a $ cd ..
~/tmp/tar $ tar cvf a.tar a
a/
a/c/
a/a/
~/tmp/tar $ cd b
~/tmp/tar/b $ tar xvf ../a.tar --unlink-first --recursive-unlink
a/
a/c/
a/a/
~/tmp/tar/b $ ls -lR
.:
total 8
drwxr-xr-x 4 ott ott 4096 Oct 31 19:21 a

./a:
total 16
drwxr-xr-x 2 ott ott 4096 Oct 28 21:31 a
drwxr-xr-x 2 ott ott 4096 Oct 28 21:31 c

./a/a:
total 0

./a/c:
total 0
~/tmp/tar/b $ tar --version
tar (GNU tar) 1.20

Maybe you have a different tar?
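To compare behaviour across tar versions, the experiment above could be scripted along these lines. This is a rough sketch that assumes GNU tar (other implementations may reject --recursive-unlink); the function name try_unlink_flags and the names stale and stale.txt are invented for the test. Note that the archive is created from the directory a itself, so it contains a member named a/, as in my transcript; an archive created with ./* has no such member for the top-level directory.

```shell
#!/bin/sh
# Assumes GNU tar; --recursive-unlink may not exist in other tar versions.
# try_unlink_flags WORKDIR   (hypothetical helper, not part of tar)
try_unlink_flags() {
    work=$1
    # Source tree: a/c. Destination tree: a/ with entries not in the archive.
    mkdir -p "$work/src/a/c" "$work/dst/a/stale" || return 1
    touch "$work/dst/a/stale.txt"
    # Archive the directory itself, so the archive has a member named "a/".
    (cd "$work/src" && tar -cf "$work/a.tar" a) || return 1
    # Extract over the existing tree with both options from the answer.
    (cd "$work/dst" &&
        tar -xf "$work/a.tar" --unlink-first --recursive-unlink) || return 1
    # Report what is left instead of assuming a particular outcome.
    if [ -e "$work/dst/a/stale" ] || [ -e "$work/dst/a/stale.txt" ]; then
        echo "stale entries survived"
    else
        echo "stale entries were removed"
    fi
}
```

Running try_unlink_flags "$(mktemp -d)" on each machine, together with tar --version, should show whether the two tars really behave differently.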

comments:

I created a test/ subdirectory and also touched a test1.txt file in the main subdirectory and then used --recursive-unlink as well as --unlink-first and both the test/ directory AND the test1.txt file were STILL in that directory structure. So I don't think this answer is correct.
@TomAuger: I've updated my reply.
I'm using TAR(1).
My results seem to be different. tar -xzvf ../b/tar.tgz --unlink-first --recursive-unlink ./ did not remove the directories or files that were not in the tar. See my edited post for details.
What is your tar version? What's your OS version? My example is from KNOPPIX (Debian based).