[There are the usual details with this patch, such as documentation (cvs.texinfo and NEWS), a sanity.sh testcase, and coding standards (whitespace is not according to HACKING). But that is probably not as big a deal before the fundamental issues, like whether automatic detection is a good idea, are dealt with. The idea of having -kbauto enable automatic detection of binary files makes sense to me. Especially if the algorithms used are at all fallible, it may not be a good idea to make this the default. My favorite algorithm for detecting binaryness is "if a text checkin followed by a (text) checkout would produce a different (corrupted) file, then assume binary". This is elegant and easy to explain (and also probably can be implemented in portable C code). The downside is that it requires reading the whole file, not just the first 1K or 8K or some such. For further thoughts on algorithrms, see the comment at the "Binary howto" node of cvs.texinfo. -kingdon, Nov 1997] Date: Fri, 07 Nov 1997 16:35:24 +0100 From: Peter Brandstrom MIME-Version: 1.0 To: info-cvs@prep.ai.mit.edu Subject: Re: Binary mode should be default. Content-Type: multipart/mixed; boundary="------------5D3F40B11D3B" This is a multi-part message in MIME format. --------------5D3F40B11D3B Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Just for fun I hacked a binary file auto-detection mechanism into CVS 1.9.18. It should work like this: cvs add -kbauto myfile.txt mypic.gif or cvs import -kbauto [...] I borrowed the detection mechanism from SourceSafe: if the first kilobyte of the file contains any null bytes (ascii 0) it is considered binary, else text. This works rather well in my experience, but this thread has already mentioned several human readable file types which should not be merged by CVS. Modify the function "file_is_binary" if you think it needs improvement. It is not an elegant hack. If you run client/server CVS you need to upgrade the server too and both client and server perform auto detection. Perhaps this could be avoided. I have not tested it much, I'm afraid. Detection worked well with 40 GIF icons I tried it on. I wouldn't bother applying the patch to anything before 1.9.18 as there will probably be things missing. Happy hacking, Peter Brandstrom --------------5D3F40B11D3B Content-Type: text/plain; charset=us-ascii; name="cvs-patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="cvs-patch" Index: add.c =================================================================== RCS file: add.c,v retrieving revision 1.1 retrieving revision 1.2 diff -u -r1.1 -r1.2 --- add.c 1997/11/07 12:00:40 1.1 +++ add.c 1997/11/07 15:10:35 1.2 @@ -56,6 +56,7 @@ List *entries; Vers_TS *vers; struct saved_cwd cwd; + int auto_detect_binary = 0; if (argc == 1 || argc == -1) usage (add_usage); @@ -71,6 +72,11 @@ case 'k': if (options) free (options); + if (strcmp(optarg, "bauto") == 0) { + options = xstrdup("-kbauto"); + auto_detect_binary = 1; + break; + } options = RCS_check_kflag (optarg); break; @@ -217,7 +223,8 @@ } send_file_names (argc, argv, SEND_EXPAND_WILD); /* FIXME: should be able to pass SEND_NO_CONTENTS, I think. */ - send_files (argc, argv, 0, 0, SEND_BUILD_DIRS); + send_files (argc, argv, 0, 0, SEND_BUILD_DIRS + | (auto_detect_binary ? SEND_AUTO_BINARY : 0)); send_to_server ("add\012", 0); if (message) free (message); @@ -258,6 +265,11 @@ finfo.rcs = NULL; + if (auto_detect_binary && file_is_binary(finfo.file, finfo.fullname)) + options = xstrdup("-kb"); + else + options = NULL; + /* Find the repository associated with our current dir. */ repository = Name_Repository (NULL, finfo.update_dir); @@ -741,4 +753,38 @@ Register (entries, user, "0", line, options, tag, (char *) 0, (char *) 0); free (line); return (0); +} + + +#define PREVIEW_SIZE 1024 + +/* + * Returns true if file is binary. A file is deemed to be binary if + * there is a null byte in the first kilobyte. + */ +int +file_is_binary(file, short_pathname) + char *file; + char *short_pathname; +{ + char buf[PREVIEW_SIZE]; + int fd; + char *bufp = buf; + int len; + + fd = CVS_OPEN(file, O_RDONLY); + if (fd < 0) + error (1, errno, "reading %s", file); + + /* Read the first kilobyte */ + while ((len = read (fd, bufp, (buf + PREVIEW_SIZE) - bufp)) > 0) + bufp += len; + + close(fd); + + if (len < 0) + error (1, errno, "reading %s", short_pathname); + + len = bufp - buf; + return memchr(buf, 0, len) != NULL; } Index: client.c =================================================================== RCS file: client.c,v retrieving revision 1.1 retrieving revision 1.2 diff -u -r1.1 -r1.2 --- client.c 1997/11/07 12:00:41 1.1 +++ client.c 1997/11/07 15:10:36 1.2 @@ -4506,6 +4506,7 @@ int build_dirs; int force; int no_contents; + int auto_binary; }; static int send_fileproc PROTO ((void *callerdat, struct file_info *finfo)); @@ -4581,6 +4582,14 @@ || args->force || strcmp (vers->ts_user, vers->ts_rcs) != 0) { + /* If no previous file (adding) and automatic binary detection + * simulate -kb + */ + + if (vers->vn_user == NULL && args->auto_binary + && file_is_binary(filename, finfo->fullname)) + vers->options = xstrdup("-kb"); + if (args->no_contents && supported_request ("Is-modified")) { @@ -4922,6 +4931,7 @@ args.build_dirs = flags & SEND_BUILD_DIRS; args.force = flags & SEND_FORCE; args.no_contents = flags & SEND_NO_CONTENTS; + args.auto_binary = flags & SEND_AUTO_BINARY; err = start_recursion (send_fileproc, send_filesdoneproc, send_dirent_proc, (DIRLEAVEPROC)NULL, (void *) &args, @@ -4953,14 +4963,14 @@ */ int client_process_import_file (message, vfile, vtag, targc, targv, repository, - all_files_binary) + check_binary) char *message; char *vfile; char *vtag; int targc; char *targv[]; char *repository; - int all_files_binary; + int check_binary; { char *update_dir; char *fullname; @@ -4989,13 +4999,15 @@ } send_a_repository ("", repository, update_dir); - if (all_files_binary) - { - vers.options = xmalloc (4); /* strlen("-kb") + 1 */ - strcpy (vers.options, "-kb"); - } - else - { + switch (check_binary) { + case ALL_FILES_BINARY: + vers.options = xstrdup("-kb"); + break; + case AUTO_DETECT_BINARY: + if (file_is_binary(vfile, fullname)) + vers.options = xstrdup("-kb"); + break; + default: vers.options = wrap_rcsoption (vfile, 1); } send_modified (vfile, fullname, &vers); Index: client.h =================================================================== RCS file: client.h,v retrieving revision 1.1 retrieving revision 1.2 diff -u -r1.1 -r1.2 --- client.h 1997/11/07 12:00:42 1.1 +++ client.h 1997/11/07 15:10:38 1.2 @@ -90,6 +90,12 @@ #define SEND_FORCE 2 #define SEND_NO_CONTENTS 4 +#define SEND_AUTO_BINARY 32768 + +/* Flags for binary auto detection */ +#define ALL_FILES_BINARY 1 +#define AUTO_DETECT_BINARY 2 + /* Send an argument to the remote server. */ void send_arg PROTO((char *string)); @@ -171,7 +177,7 @@ extern void client_import_setup PROTO((char *repository)); extern int client_process_import_file PROTO((char *message, char *vfile, char *vtag, - int targc, char *targv[], char *repository, int all_files_binary)); + int targc, char *targv[], char *repository, int check_binary)); extern void client_import_done PROTO((void)); extern void client_notify PROTO((char *, char *, char *, int, char *)); #endif /* CLIENT_SUPPORT */ Index: cvs.h =================================================================== RCS file: cvs.h,v retrieving revision 1.1 retrieving revision 1.2 diff -u -r1.1 -r1.2 --- cvs.h 1997/11/07 12:00:43 1.1 +++ cvs.h 1997/11/07 15:10:39 1.2 @@ -862,3 +862,6 @@ #if defined(SERVER_SUPPORT) || defined(CLIENT_SUPPORT) #include "server.h" #endif + + +int file_is_binary PROTO((char *file, char *short_pathname)); --------------5D3F40B11D3B--