Date: Sat, 23 Mar 2002 22:30:12 -0600 From: Vincent Stemen <bu@hightek.org>
Hi everybody.
This is just a courtesy note to let you know I finally set up a mailing list for bu. I think I got everybody on it who has emailed me about bu expressing significant interest in the project. Most have asked to be put on it once I finally set one up. I will post new release announcements to this list as well as answer questions. Hopefully, this will allow everybody to help each other if I get busy and am slow about responding :-).
It is a simple list that does not yet require one to be subscribed to post to it, only to receive messages from it. Everybody that receives this message is already subscribed. I will eventually get around to setting up real mailing list software such as majordomo.
Please accept my apologies if you do not want to be on the list. Send me an un-subscribe note and I will remove you. Otherwise, please use the list for future questions/bug reports so that everybody may benefit from it, unless you really feel something merits private communication.
Regards, Vincent
Date: Wed, 27 Mar 2002 02:08:36 -0600 From: Vincent Stemen <bu@hightek.org>
On Sat, Mar 23, 2002 at 01:20:22PM +0100, Alain Cochard wrote: > PS: Still dying for version 3!!
The CDRW dump capability is taking longer than I expected. First I had to re-write a bunch of the code because it turned out that GNU tar is buggy when it comes to restoring from pipes. I spoke with the maintainer about it and he does not have the time right now to fix it because of higher priority bugs. Tar does not do multi-volume with compression so I am writing more sophisticated code to handle it. I switched to other techniques to create the archive, which involve forking a process into the background to find the files and stream them to another process that simultaneously generates the archives, breaking up the volumes as needed. I want high performance without having to wait on one stage to complete before the next one starts. I ran into some technical difficulties (due to a small gap in my knowledge of Perl that nailed me) getting the streams to work the way I want them to. I think I now have it mostly resolved so it should not be much longer. It turned out that the new technique actually is better anyway because I came up with a way to vary the amount of data that goes on each volume depending on the compression ratio rather than putting a fixed amount of pre-compressed data on each volume.
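As an illustration of sizing volumes by compressed output rather than by a fixed amount of input, here is a minimal Python sketch. It is not bu's code (bu is written in shell and Perl); the function name `split_into_volumes` and the use of zlib are assumptions made only for this example. Each volume is closed once its compressed size reaches the limit, so highly compressible data packs more input into each volume.

```python
import zlib


def split_into_volumes(chunks, volume_limit):
    """Compress byte chunks into independent zlib volumes.

    A volume is closed as soon as its compressed size reaches
    volume_limit, so it may overshoot by up to one compressed chunk.
    (Hypothetical helper for illustration only.)
    """
    volumes = []
    compressor = zlib.compressobj()
    current = bytearray()
    has_data = False
    for chunk in chunks:
        current += compressor.compress(chunk)
        # A sync flush forces out all pending output so that
        # len(current) reflects the real compressed size so far.
        current += compressor.flush(zlib.Z_SYNC_FLUSH)
        has_data = True
        if len(current) >= volume_limit:
            current += compressor.flush()  # finish this volume's stream
            volumes.append(bytes(current))
            compressor = zlib.compressobj()
            current = bytearray()
            has_data = False
    if has_data:
        current += compressor.flush()
        volumes.append(bytes(current))
    return volumes
```

Each returned volume can be decompressed on its own with `zlib.decompress`. The sync flush after every chunk costs a little compression ratio in exchange for an accurate running size.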
Date: Wed, 27 Mar 2002 09:47:50 +0100 From: "Gabor Z. Papp" <gzp@myhost...>
* Vincent Stemen <bu@hightek.org>:
| On Sat, Mar 23, 2002 at 01:20:22PM +0100, Alain Cochard wrote: | > PS: Still dying for version 3!! | | The CDRW dump capability is taking longer than I expected. First I | had to re-write a bunch of the code because it turned out that GNU tar | is buggy when it comes to restoring from pipes. I spoke with the | maintainer about it and he does not have the time right now to fix it
Try with http://www.fokus.gmd.de/research/cc/glone/employees/joerg.schilling/private/man/star.html
Date: Fri, 17 May 2002 12:40:11 -0500 From: Vincent Stemen <bu@hightek.org>
On Fri, May 17, 2002 at 10:32:58AM +0200, Dirk Schouten wrote: > Hello Vincent, > > Your program is working fine at our primary school. > We noticed one thing that's a bit unfortunate for us. > As far as I can see B U does not copy empty directories. > We have a system in which every group in school is just one user in /home. > Every pupil in a group has a directory of her/his own. When directories > remain empty and are not copied, we have to do extra work in case of a > crash, i.e. finding out which directories are missing and recreating them. > > Is this a suggestion for improvement? Or a bug report?? Or ...? >
Yes, I am aware of this. I have received at least one other email about it. It is on the to do list to be fixed and should be fixed sometime in the next few beta releases. It is an architectural problem with the way bu uses "cp". I plan to re-write part of the code in perl in a way which will fix this and, I expect, will dramatically improve the performance.
The new CDRW dump features in the version about to be released do dump empty directories if that is helpful to you in the meantime.
Vincent
> Have a nice weekend, > Dirk > _________________________________________ > Dirk Schouten > > Media Action Projects > http://utopia.knoware.nl/users/schoutdi > Take a peek at the NEWS and REVIEWS sections!
Date: Sat, 18 May 2002 21:07:50 -0500 From: Vincent Stemen <bu@hightek.org>
On Fri, May 17, 2002 at 10:57:41PM +0200, Dirk Schouten wrote: > Hi Vincent,
> A few comments from a computer and UNIX newbie, looking to B U from his own > perspective: > > >Yes, I am aware of this. I have received at least one other email > >about it. It is on the to do list to be fixed and should be fixed > >sometime in the next few beta releases. > As you can imagine, I look forward to it! > > >It is an architectural > >problem with the way bu uses "cp". I plan to re-write part of the > >code in perl in a way which will fix this and, I expect, will > >dramatically improve the performance.
> One of the things I like about B U is that it's not in Perl. > So it runs on any Linux/UNIX.
All Linux and BSD systems I have seen are distributed with Perl nowadays, and I have not heard of any other Unix platforms that Perl cannot run on, so I do not expect this to be an issue. The existing usage and behavior of the NFS backup features of bu will not change if I re-write any of it in Perl, so you should not notice any difference other than performance.
I wrote all the new CDRW code in Perl because a lot of the code was just too complex to try to implement in shell script.
> And the performance improvement?? Better copies? Faster? Here the program > works at night, so we do not need performance improvement.
It should be considerably faster, for one, because it will be copying the files as they are being found. Currently, bu runs "find" on each specified directory and generates a list of every file to be backed up and filters it through the Exclude list before it starts copying any files from that directory. Not only that, in the new CDRW Perl code, I exclude entire directories from being traversed during the find process if they are in the Exclude list. With the existing shell script code for FS backups using find, directories that are excluded still get traversed by find and every file is screened. This is far more overhead.
I have already written the Perl code to do most of this for the CDRW features and the performance is pretty good. Also, for NFS backups, bu currently runs "cp" on every file. If it turns out not to be too difficult to retain all the file attributes, I also may do the copying directly with Perl which should be faster than running cp on every file. Another advantage is that it would also eliminate the need to have GNU fileutils installed on BSD.
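The two ideas described above, handing files over as soon as they are found and never descending into excluded directories, can be sketched in Python as follows. This is only an illustration, not bu's Perl code; the function name `find_files` and its parameters are hypothetical.

```python
import os


def find_files(top, excluded_dirs):
    """Yield file paths under top, skipping excluded directories entirely.

    Because this is a generator, a caller can start copying each file
    as soon as it is found instead of waiting for a complete list.
    (Hypothetical helper for illustration only.)
    """
    excluded = {os.path.normpath(d) for d in excluded_dirs}
    for dirpath, dirnames, filenames in os.walk(top):
        # Editing dirnames in place tells os.walk not to descend into
        # the removed directories, so their contents are never visited.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(dirpath, d)) not in excluded
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

With this approach an excluded directory costs one comparison, no matter how many files it contains, whereas filtering a finished file list screens every file inside it.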
> > >The new CDRW dump features in the version about to be released do dump > >empty directories if that is helpful to you in the meantime.
> To be honest, this does not help us. On the contrary! > We like a program that just backs up. No untarring, no endless searches for > files, just what you write in your white paper. The strength of the program > is its simplicity. It backs up, nothing more, nothing less. If a directory > is empty, it's the problem (or the pleasure) of the user. I hope you can > understand this.
Absolutely. Don't worry, I have no intention of removing any of the existing features. The CDRW dump features are in addition because we needed to be able to periodically dump our backup filesystem to removable media to keep off site and we could find no other suitable tools to do this. Especially none that run on both FreeBSD and Linux. They are meant to complement bu's existing features. You can continue running bu as you have been without ever using the new features.
> Would it be an easy adjustment in the program to have it backup > _everything_. If not, we live with it and ask the pupils to put anything in > their directories. >
I did not originally think it would be easy to include empty directories with the existing code without re-writing part of the backup() routine in Perl. However, I was looking at the code again today and I think I came up with a way to add this ability with the existing shell script. I will experiment with it as soon as I get the chance. At least that would fix the immediate problem whether the speed improves or not, and I can go back later on to improve it with Perl.
Vincent
Date: Sun, 19 May 2002 14:57:20 +0200 From: Dirk Schouten <schoutdi@knoware...>
Hi Vincent, Thanks for your quick reply.
<snip> >> Would it be an easy adjustment in the program to have it backup >> _everything_. If not, we live with it and ask the pupils to put anything in >> their directories. >> > >I did not originally think it would be easy to include empty >directories with the existing code without re-writing part of the >backup() routine in Perl. However, I was looking at the code again >today and I think I came up with a way to add this ability with the >existing shell script. I will experiment with it as soon as I get the >chance. At least that would fix the immediate problem whether the >speed improves or not, and I can go back later on to improve it with >Perl.
As you can imagine, we appreciate your idea (when is it ready...:-)). And/but, there is more to it.
We now use your free program. In our school I strongly promote the idea that, although we use as much 'free software' as possible, this certainly does not mean that, to reverse quoting one of the GNU/Linux gurus, we do not want to pay or to contribute to free software. In our opinion it is a necessary way to keep developers developing, to get improvements, and in the end to ensure the continued existence of free software. We have a limited budget (being a normal primary school in Holland) and are quite willing to pay for the improvement we ask for. Maybe we should further discuss this topic off-list or maybe it is a very good idea to keep it public, I do not know. Looking forward to hearing from you,
Hope you enjoy your weekend, Dirk
_________________________________________ Dirk Schouten
Media Action Projects http://utopia.knoware.nl/users/schoutdi Take a peek at the NEWS and REVIEWS sections!
Date: Mon, 20 May 2002 21:14:07 -0500 From: Vincent Stemen <bu@hightek.org>
I have finally released bu-3.0-beta. It has a few new FS backup features but most of the new features are related to CDRW dumps. I have also finally put up a real web page for bu along with a lot of new documentation describing the new features as well as most of the existing features that I had not yet documented before.
This was a larger undertaking than I had expected to have the capabilities I wanted. There are over 2200 lines of new and changed code. The change log and feature list show most of the details.
The CDRW features only work on FreeBSD until I finish the Linux portion of the code. If any of you are running FreeBSD, I would appreciate feedback on the CDRW dump features. Linux users can still upgrade and use it as they have been.
Date: Mon, 27 May 2002 14:21:22 -0500 From: Vincent Stemen <bu@hightek.org>
On Mon, May 27, 2002 at 01:59:11PM +0200, Dirk Schouten wrote: > Hi Vincent, > Maybe I'm a bit late with my question, but here it is anyhow. > Will our improvement be incorporated as an option in bu's configuration file? > Something like: > > copy empty directories = yes/no > > Bye for now, > dirk
Hi Dirk.
I was not originally planning to. Do you need the ability to turn off the copying of empty directories? If so, how would that be an advantage in your situation? It would be possible to add if necessary with a little more coding. I am moving this to the mailing list in case anybody else has any opinions to express about it.
Also, to give you an update: The code changes are taking considerably more work than I had anticipated. I have made good progress but will likely not be done for a few more days. I discovered another weakness while testing that I am also fixing while I am at it. I will write documentation to fully describe it but, briefly, there were situations where you would not get the correct permissions and ownerships on directories in the backup filesystem. Since I want bu to mirror filesystems as closely as possible, I am fixing this while I am at it. It is actually a problem with all existing Unix tools when it comes to bu's ability to arbitrarily back up any file or directory whether the parent directories have been backed up already or not.
Example:
If you type "bu /home/joe/projects/foo" and /home/joe/projects did not already exist on the backup filesystem, the parent directories, /home/joe/projects get created with default ownership and permissions not matching those on the original filesystem. This happens even if you use "cp -a", cpio, or tar.
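One way to avoid the default attributes in the example above is to create each missing parent explicitly and then copy its mode and ownership from the matching source directory. The Python sketch below illustrates that idea under stated assumptions; it is not bu's implementation, and the function name `mirror_parents` and its parameters are hypothetical. Changing ownership normally requires root, so a failure there is ignored.

```python
import os
import stat


def mirror_parents(rel_dir, src_root, backup_root):
    """Create rel_dir under backup_root, mirroring each new directory's
    mode and ownership from the same path under src_root.
    (Hypothetical helper for illustration only; Unix only.)
    """
    created = []
    src, dst = src_root, backup_root
    for part in rel_dir.strip(os.sep).split(os.sep):
        src = os.path.join(src, part)
        dst = os.path.join(dst, part)
        if not os.path.isdir(dst):
            os.mkdir(dst)
            created.append((src, dst))
    # Apply attributes deepest first so that a restrictive parent mode
    # cannot get in the way while the chain is still being built.
    for src_dir, dst_dir in reversed(created):
        st = os.stat(src_dir)
        try:
            os.chown(dst_dir, st.st_uid, st.st_gid)
        except PermissionError:
            pass  # not running as root; ownership is left as created
        os.chmod(dst_dir, stat.S_IMODE(st.st_mode))
    return dst
```

For the example above this would be called as `mirror_parents("home/joe/projects", "/", "/backup")`, with `/backup` standing in for wherever the backup filesystem is mounted. Directories that already exist in the backup are left untouched.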
Vincent
Date: Thu, 30 May 2002 04:00:02 -0500 From: Vincent Stemen <bu@hightek.org>
On Mon, May 27, 2002 at 11:15:41PM +0200, Dirk Schouten wrote: > Hi Vincent, > Some replies from a UNIX analphabetic and a computer ignorant: > > >I was not originally planning to. Do you need the ability to turn off > >the copying of empty directories? If so, how would that be an > >advantage in your situation? > Vincent, I do not need turning off empty directories. I was only wondering > how a next version of B U would look like. > Would it be version x with "no backup of empty directories" and > version x1 with "the ability to backup empty dirs", > or would it be version xn with the choice of yes/no backup of empty dirs. > Something like the possibility to do full or incremental backups as it is > now in B U.
Hi Dirk.
The next beta release will automatically back up empty directories in order to duplicate the original filesystems as closely as possible. I do not currently plan to make it an option to turn it off unless I or somebody else comes up with a significant need or reason to do so that I have not yet thought of.
> For your information (and to reveal a bit on my 'simple of mindness'): > I do a _full_ backup every night. I do not want to remember what's the full > backup and what's the incremental. I am only a simple network administrator > on a simple primary school. I feel safe with a full backup. > I tell you this to inform you on the situation of one of your users.
I have been doing all incremental backups with bu for several years. I am curious why you do a _full_ backup every night (assuming you mean you are turning incremental off). I'm not sure if you are fully aware of how bu works but incremental mode copies any file that is newer or does not exist on the backup filesystem. So, the first backup will be a full one by default since no files have been copied yet. The only situation that should cause the end result of doing incremental vs full to be any different is if the time setting is significantly off between the two machines causing the file modification times to be wrong on one machine or the other. In that case you can do a full backup once after resetting the time then go back to incremental.
One of the ways I use bu the most is to stay backed up throughout the day. I keep it in incremental mode and sometimes run it from the command line as often as every few minutes. I will do some work that I don't want to lose and type "bu ." from my work directory and have the warm fuzzy feeling of seeing the specific files I have changed get copied to our NFS server without having to wait for it to copy files unnecessarily.
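The incremental rule described above (copy a file if it is newer than its backup copy or has no backup copy yet) can be written down as a small Python sketch. It is an illustration only, not bu's code; the function name `needs_backup` is hypothetical, and comparing modification times is an assumption about how "newer" is decided.

```python
import os


def needs_backup(src_path, backup_path, incremental=True):
    """Return True if src_path should be copied to backup_path.

    (Hypothetical helper for illustration only.)
    """
    if not incremental:
        return True  # full mode: copy everything
    if not os.path.exists(backup_path):
        return True  # never backed up, so the first run copies it
    # Copy only when the source was modified after the backup copy.
    return os.path.getmtime(src_path) > os.path.getmtime(backup_path)
```

The sketch also shows why clock differences between machines matter: if the backup copy carries a timestamp that is wrongly in the future, a changed source file compares as older and is skipped until a full backup is run.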
Of course, I still need to add a synchronization feature to delete old files from the backup filesystem that no longer exist on the source filesystem. A couple of people have requested that but I have not been able to commit the time to do it yet. In fact, it would probably even be nice for it to be smart enough to correct any changes in file ownerships/permissions while it is at it. Until I do, the backup filesystem gradually gets dirtier and fuller than the source filesystem over time and eventually needs to be wiped and started over.
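As a rough idea of what the first half of such a synchronization pass involves, the Python sketch below lists entries that exist in the backup tree but no longer exist in the source tree. It is purely illustrative: bu has no such feature at the time of this message, the function name `stale_entries` is hypothetical, and the sketch only reports candidates rather than deleting anything.

```python
import os


def stale_entries(src_root, backup_root):
    """Return sorted paths (relative to backup_root) that exist in the
    backup tree but not in the source tree.
    (Hypothetical helper for illustration only.)
    """
    stale = []
    for dirpath, dirnames, filenames in os.walk(backup_root):
        rel = os.path.relpath(dirpath, backup_root)
        kept = []
        for d in dirnames:
            if os.path.lexists(os.path.join(src_root, rel, d)):
                kept.append(d)
            else:
                # Report the directory once and do not descend into it.
                stale.append(os.path.normpath(os.path.join(rel, d)))
        dirnames[:] = kept
        for f in filenames:
            if not os.path.lexists(os.path.join(src_root, rel, f)):
                stale.append(os.path.normpath(os.path.join(rel, f)))
    return sorted(stale)
```

A real implementation would also need to decide how to handle entries whose type changed, and whether to correct ownerships and permissions in the same pass.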
If anybody needs this badly enough to contribute financially to the cause, I could also probably take time off to get that done sooner as well.
Vincent
Date: Mon, 27 May 2002 23:15:41 +0200 From: Dirk Schouten <schoutdi@knoware...>
Hi Vincent, Some replies from a UNIX analphabetic and a computer ignorant:
>I was not originally planning to. Do you need the ability to turn off >the copying of empty directories? If so, how would that be an >advantage in your situation?
Vincent, I do not need turning off empty directories. I was only wondering how a next version of B U would look like. Would it be version x with "no backup of empty directories" and version x1 with "the ability to backup empty dirs", or would it be version xn with the choice of yes/no backup of empty dirs. Something like the possibility to do full or incremental backups as it is now in B U.
For your information (and to reveal a bit on my 'simple of mindness'): I do a _full_ backup every night. I do not want to remember what's the full backup and what's the incremental. I am only a simple network administrator on a simple primary school. I feel safe with a full backup. I tell you this to inform you on the situation of one of your users.
>Also, to give you an update: The code changes are taking considerably >more work than I had anticipated. I have made good progress but will >likely not be done for a few more days.
We do not want you to have it ready 'tomorrow'. (But this does not mean we want _too_ much delay)
> I discovered another weakness >while testing that I am also fixing while I am at it. I will write >documentation to fully describe it but, briefly, there were situations >where you would not get the correct permissions and ownerships on >directories in the backup filesystem. Since I want bu to mirror >filesystems as closely as possible, I am fixing this while I am at it.
This I can grasp (understand) and I am glad you fix it, although I have not yet noticed the problem.
>If you type "bu /home/joe/projects/foo" and /home/joe/projects did >not already exist on the backup filesystem, the parent directories, >/home/joe/projects get created with default ownership and >permissions not matching those on the original filesystem. This >happens even if you use "cp -a", cpio, or tar.
Ahh, I am so glad you are the expert!! I do not understand what to say. Keep on working and we (all the users of BU 'next version') will end up with something that's an improvement.
To give you a small update on what I have been doing the last days: Arguing in a Dutch newsgroup on education and new media about educational institutions paying for 'free' software. In Holland there are a lot of schools that use 'free software' in the sense of 'free beer'. What I say is that we have money enough to pay developers and that we should, with our contributions, keep free software free. The last couple of hours, since this remark the topic has remained _very_ silent.... Bah, sometimes I despise my country-fellows and deeply understand what 'Dutch treat' means.
We keep in touch, Dirk
_________________________________________ Dirk Schouten
Media Action Projects http://utopia.knoware.nl/users/schoutdi Take a peek at the NEWS and REVIEWS sections!
Date: Thu, 30 May 2002 11:41:28 +0200 From: Dirk Schouten <schoutdi@knoware...>
Hi Vincent, LOL on your 'fuzzy warm feeling'! Thank you for your extensive reply on full and incremental.
>I am curious why you do a _full_ backup every night (assuming you mean >you are turning incremental off).
That's correct.
> I'm not sure if you are fully aware >of how bu works but incremental mode copies any file that is newer or >does not exist on the backup filesystem. So, the first backup will be >a full one by default since no files have been copied yet.
As I understand it. Even when incremental = on, the first time you do a backup B U makes a full backup.
>The only >situation that should cause the end result of doing incremental vs >full to be any different is if the time setting is significantly off >between the two machines causing the file modification times to be >wrong on one machine or the other. In that case you can do a full >backup once after resetting the time then go back to incremental.
I think I understand. Maybe I should tell you: I make a backup every night on a second HD in the same computer. I have 5 directories: /monday, /tuesday ... /friday. With cron at 04.00 am a backup is made of a couple of dirs. Once a week I shut off the sambaserver, insert a HD in a tray, switch on the server and make a backup which I store on another place, outside the school. Maybe there are better, more elegant, less time consuming possibilities. It sure gives this newbie a warm fuzzy feeling :).
>Of course, I still need to add a synchronization feature to delete old >files from the backup filesystem that no longer exist on the source >filesystem. A couple of people have requested that but I have not >been able to commit the time to do it yet. In fact, it would probably >even be nice for it to be smart enough to correct any changes in file >ownerships/permissions while it is at it. Until I do, the backup >filesystem gradually gets dirtier and fuller than the source >filesystem over time and eventually needs to be wiped and started >over. > >If anybody needs this bad enough to contribute financially to the >cause, I could also probably take time off to get that done sooner as >well.
Since this is the mailing list it is perhaps an idea to ask you how much such a significant improvement would cost. The readers could then say how much they individually would like to contribute. Ah, how I love the implications of the new economy!
Kind regards, dirk
_________________________________________ Dirk Schouten
Media Action Projects http://utopia.knoware.nl/users/schoutdi Take a peek at the NEWS and REVIEWS sections!
Date: Fri, 31 May 2002 16:59:58 -0500 From: Vincent Stemen <bu@hightek.org>
On Fri, May 31, 2002 at 03:03:09PM +0200, Afd. ICT Rosaboekdrukker wrote: > Hi Vincent, > Doing a full backup and occasionally looking at the screen, seeing _all_ the > files being copied passing by gives me a warmer, fuzzier feeling than an > incremental backup. With incremental I only see a couple of files. Is this > really trustworthy?? :)).
Hi Dirk.
It has been trustworthy for me. On the other side of the coin, I have caught mistakes many times using incremental because I expected only a few files to be copied and when a lot of files started going by, discovered I specified a wrong directory or was using the wrong backup filesystem. Also, if I cannot remember if I made any more changes since I last backed up, and bu tells me "No new or changed files to back up", it quickly confirms to me that I am fully backed up. This is the way I have used bu for years.
Incremental mode is really nice for running from the command line on specific directories or files all the time throughout the day, even if you feel more comfortable using full mode for your nightly backups. In fact, this method of usage is one of the primary reasons I originally wrote bu. Up to 8 or 10 hours of work can be lost in a crash before the scheduled nightly backup and no other tools I found were designed to be used in this way.
However, that is why I made bu so configurable, so it fits the needs and preferences of everybody :-).
Have a good weekend. Vincent
Date: Fri, 31 May 2002 23:07:14 -0500 From: Vincent Stemen <bu@hightek.org>
On Sat, Mar 23, 2002 at 01:18:29PM -0600, Vincent Stemen wrote: > On Sat, Mar 23, 2002 at 12:14:55PM +0100, Alain Cochard wrote: > > Hello. Me again. > > > > I had a file, say 'foofoo'. After having changed the name of this > > file, I created a directory also named 'foofoo' (to which I moved the > > initial 'foofoo' file, together with other new files). > > > > At the next incremental backup, bu smartly notified me that: > > > > cp: `/backup/home/me/foofoo' exists but is not a directory > > > > The problem is that the above warning (or an equivalent one) did not > > appear in the log file. Only because I was standing by chance in > > front of the computer that I noticed it. > > > > Is it the expected behavior? If yes, I would be tempted to say that > > it is not fortunate. > > > > I'm using version 2.8. > > > > Best regards, > > Alain > > > > Hi Alain. > That's a good point. The stderr of cp needs to go in the log file as > well. I added it to my todo list. > > Thanks for the input. > > Vincent >
This was not originally posted on the list but I am following up here in case others have encountered it.
This problem is now fixed in the about to be released 3.1 beta. The backup() routine has been completely re-written in Perl and no longer uses "cp". It uses a tar stream and utilizes part of the code I wrote for the CDRW dumps. Situations similar to the one described above no longer generate an error. It corrects the difference. If a file type has changed between backups, for example from a directory to a regular file, or vice versa, bu will remove the file or directory from the backup filesystem first, so there should be no more errors of this nature. It should always correct the backup filesystem to match the original filesystem.
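The type-change handling described above can be illustrated with a short Python sketch. This is not bu's Perl code; the function name `clear_type_conflict` is hypothetical, and the sketch only distinguishes directories from everything else.

```python
import os
import shutil


def clear_type_conflict(src_path, backup_path):
    """Remove backup_path if it is a directory while src_path is not,
    or the other way around. Return True if something was removed.
    (Hypothetical helper for illustration only.)
    """
    if not os.path.lexists(backup_path):
        return False
    # Symlinks to directories are treated as non-directories here.
    src_is_dir = os.path.isdir(src_path) and not os.path.islink(src_path)
    dst_is_dir = os.path.isdir(backup_path) and not os.path.islink(backup_path)
    if src_is_dir == dst_is_dir:
        return False
    if dst_is_dir:
        shutil.rmtree(backup_path)
    else:
        os.remove(backup_path)
    return True
```

In the 'foofoo' case quoted above, the old backup file would be removed first, so the new directory could then be copied in its place without an "exists but is not a directory" error.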
When there are errors, there is also now an error log, $LOG.errors. Not all errors are logged in it yet, but some are.
Vincent