[This patch is relative to CVS 1.8 or so.  See
http://www.cyclic.com/cvs/dev-metadata.html or TODO #194 for further
thoughts on the general issue of separating metadata.

One specific idea for starting with this patch and moving towards a
more general separated metadata implementation is to version control
the attributes so that the version number of the attributes serves as
a change ID (this means tagging would not need to even read the
attributes).  I'm not really sure whether _watchers and so on should
be version controlled, though.

The code needs locks in import.c to control access to the fileattr
file.

The idea below, about separating tags in addition, has merit (see TODO
#194 for more on the general concept of separating metadata).  This
wants to be configurable for the sake of tools which directly access
the repository, such as cvsweb and CVSup.  It might be nice to write a
"cvs export-rcs" command so that people can still get RCS files
including the tags (e.g. for transferring their history to another
version control system).

This patch, too, needs to be configurable.  If some users are using
the _head and _branch attributes and others are not, then some of the
users will get stale _head and _branch data and files will
mysteriously fail to update (at least, I would think that would be the
symptom).  -kingdon]

Date: Mon, 17 Jun 1996 16:52:59 -0400
From: Ian Lance Taylor
To: bug-cvs@prep.ai.mit.edu
Subject: Use attributes to avoid reading the RCS file

This patch uses file attributes to avoid reading the RCS file in the
common case.  This speeds up common cases of update by some 30% to 40%
on an entirely local disk.  I would expect somewhat larger speedups
when using an NFS mounted repository.

The basic idea is from Jim Kingdon.  I simply store the RCS file head
and branch values as attributes _head and _branch.  If the file is in
the attic, I also store an _attic attribute.  This information is all
that RCS_parse needs.  In the common case of running cvs update when
the files are on the trunk and most are up to date, CVS does not have
to open the repository file at all.

This change does mean that file attributes will always be present in
the repository, which means that if somebody uses CVS 1.9 on a
repository it will no longer be possible to use versions of CVS prior
to 1.7 on that repository.  On the other hand, since the file
attributes merely duplicate information that is in the RCS file, it is
safe to simply delete all the attribute information.

I am considering writing a patch to store tag information in another
file in the CVS directory in the repository.  My thought is that the
tag information would be stored only in this file, and not in the RCS
file at all.  This would make it possible to run cvs tag without
rewriting every file in the repository, which I believe would be a
significant speedup.  Does anybody have any opinions about this?

Ian

Mon Jun 17 15:11:09 1996  Ian Lance Taylor

        * fileattr.h: Document new attributes.
        * rcs.c: Include fileattr.h.
        (RCS_parse_with_attrs): New function.
        (RCS_reparsercsfile): Double check head and branch fields.
        * rcs.h (RCS_parse_with_attrs): Declare.
        * recurse.c (do_file_proc): Call RCS_parse_with_attrs rather than
        RCS_parse.
        * import.c: Include fileattr.h.
        (import_descend): Call fileattr_startdir, fileattr_write, and
        fileattr_free.
        (update_rcs_file): Set _head attribute.
        (add_rcs_file): Set _head and _branch attributes.
        * commit.c (remove_file): Clear _branch attribute if resetting to
        default branch.  Set _head and _attic attributes if head is
        removed.
        (fixbranch): Set _branch attribute.
        (checkaddfile): Clear _attic attribute if file is moved out of
        attic.  Set _attic attribute if new file is created in attic.
        (lock_RCS): Clear _branch attribute if resetting to default
        branch.
        * checkin.c (Checkin): Set _head attribute with new value.
        * sanity.sh (devcom): Comment out devcom-b2 test, since attributes
        now always exist.

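The fast path described in the message amounts to a cache lookup with a back-fill on a miss.  Here is a minimal, self-contained sketch of that idea.  It is not the CVS code: attr_get, attr_set, parse_head_from_rcs, lookup_head, and the fixed-size table are hypothetical stand-ins for the fileattr store and for RCS_parse, and the "1.4" head value is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-directory attribute store. */
struct attr { const char *file; const char *name; const char *value; };

#define MAX_ATTRS 16
static struct attr attr_table[MAX_ATTRS];
static size_t attr_count;

/* Counts how often the (simulated) expensive parse runs. */
int slow_parses;

/* Return the stored value, or NULL if the attribute is not set. */
const char *attr_get(const char *file, const char *name)
{
    size_t i;
    for (i = 0; i < attr_count; i++)
        if (strcmp(attr_table[i].file, file) == 0
            && strcmp(attr_table[i].name, name) == 0)
            return attr_table[i].value;
    return NULL;
}

/* Set or overwrite an attribute. */
void attr_set(const char *file, const char *name, const char *value)
{
    size_t i;
    for (i = 0; i < attr_count; i++)
        if (strcmp(attr_table[i].file, file) == 0
            && strcmp(attr_table[i].name, name) == 0)
        {
            attr_table[i].value = value;
            return;
        }
    assert(attr_count < MAX_ATTRS);
    attr_table[attr_count].file = file;
    attr_table[attr_count].name = name;
    attr_table[attr_count].value = value;
    attr_count++;
}

/* Assumption: stands in for opening and parsing the ,v file. */
const char *parse_head_from_rcs(const char *file)
{
    (void) file;
    slow_parses++;
    return "1.4";
}

/* Use the attribute when present; otherwise parse once and store it. */
const char *lookup_head(const char *file)
{
    const char *head = attr_get(file, "_head");
    if (head != NULL)
        return head;
    head = parse_head_from_rcs(file);
    attr_set(file, "_head", head);
    return head;
}
```

Calling lookup_head twice for the same file runs the simulated parse only once.  The same shape also shows the staleness risk raised in the note at the top: any writer that changes the head without calling attr_set leaves the cached value out of date.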
cvs server: Diffing .
Index: checkin.c
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/checkin.c,v
retrieving revision 1.21
diff -u -r1.21 checkin.c
--- checkin.c   1996/06/15 05:16:26     1.21
+++ checkin.c   1996/06/17 20:26:24
@@ -124,6 +124,7 @@
             Register (entries, file, vers->vn_rcs, vers->ts_user, vers->options,
                       vers->tag, vers->date, (char *) 0);
             history_write (type, (char *) 0, vers->vn_rcs, file, repository);
+            fileattr_set (file, "_head", vers->srcfile->head);
 
             if (tocvsPath)
                 if (unlink_file_dir (tocvsPath) < 0)
Index: commit.c
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/commit.c,v
retrieving revision 1.64
diff -u -r1.64 commit.c
--- commit.c    1996/06/15 05:16:26     1.64
+++ commit.c    1996/06/17 20:27:06
@@ -1431,6 +1431,7 @@
                    rcs);
             return (1);
         }
+        fileattr_set (file, "_branch", NULL);
     }
 
 #ifdef SERVER_SUPPORT
@@ -1494,6 +1495,20 @@
             return (1);
         }
         free(tmp);
+
+        {
+            char *new_rev, *cp;
+            int v;
+
+            new_rev = (char *) xmalloc (strlen (rcsnode->head) + 2);
+            strcpy (new_rev, rcsnode->head);
+            cp = strrchr (new_rev, '.');
+            v = atoi (cp + 1);
+            sprintf (cp + 1, "%d", v + 1);
+            fileattr_set (file, "_head", new_rev);
+            free (new_rev);
+            fileattr_set (file, "_attic", "");
+        }
     }
 
     /* Print message that file was removed. */
@@ -1598,6 +1613,7 @@
         if ((retcode = RCS_setbranch (rcs, branch)) != 0)
             error (retcode == -1 ? 1 : 0, retcode == -1 ? errno : 0,
                    "cannot restore branch to %s for %s", branch, rcs);
+        fileattr_set (file, "_branch", branch);
     }
 }
 
@@ -1661,6 +1677,8 @@
                        file);
                 return (1);
             }
+
+            fileattr_set (file, "_attic", NULL);
         }
 
         if ((rcsfile = *rcsnode) == NULL)
@@ -1727,6 +1745,9 @@
             return (1);
         }
 
+        /* We don't need to set any file attributes here, because we
+           will wind up calling Checkin, which will handle them.  */
+
         /* put the new file back where it was */
         rename_file (fname, file);
 
@@ -1793,6 +1814,9 @@
 
     fileattr_newfile (file);
 
+    if (tag && newfile)
+        fileattr_set (file, "_attic", "");
+
     fix_rcs_modes (rcs, file);
     return (0);
 }
@@ -1863,6 +1887,7 @@
                 free (branch);
                 return (1);
             }
+            fileattr_set (user, "_branch", NULL);
         }
         err = RCS_lock(rcs, NULL, 0);
     }
Index: fileattr.h
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/fileattr.h,v
retrieving revision 1.2
diff -u -r1.2 fileattr.h
--- fileattr.h  1996/06/17 19:16:16     1.2
+++ fileattr.h  1996/06/17 20:27:09
@@ -50,7 +50,16 @@
    EDITOR > VAL { , EDITOR > VAL }
    where EDITOR is a username, and VAL is TIME+HOSTNAME+PATHNAME, where
    TIME is when the "cvs edit" command happened,
-   and HOSTNAME and PATHNAME are for the working directory.  */
+   and HOSTNAME and PATHNAME are for the working directory.
+
+   _head: Head revision of file.  This is a copy of the value of the head
+   keyword in the RCS file.
+
+   _branch: Default branch of file.  This is a copy of the value of the
+   branch keyword in the RCS file.
+
+   _attic: If the attribute is present, the RCS file is in the Attic.
+   The value is unimportant.  */
 
 #define CVSREP_FILEATTR "CVS/fileattr"
 
Index: import.c
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/import.c,v
retrieving revision 1.37
diff -u -r1.37 import.c
--- import.c    1996/06/15 05:16:29     1.37
+++ import.c    1996/06/17 20:27:51
@@ -17,6 +17,7 @@
  */
 
 #include "cvs.h"
+#include "fileattr.h"
 #include "savecwd.h"
 
 #define FILE_HOLDER     ".#cvsxxx"
@@ -335,6 +336,8 @@
     int err = 0;
     List *dirlist = NULL;
 
+    fileattr_startdir (repository);
+
     /* first, load up any per-directory ignore lists */
     ign_add_file (CVSDOTIGNORE, 1);
     wrap_add_file (CVSDOTWRAPPER, 1);
@@ -406,6 +409,9 @@
         (void) closedir (dirp);
     }
 
+    fileattr_write ();
+    fileattr_free ();
+
     if (dirlist != NULL)
     {
         Node *head, *p;
@@ -569,6 +575,8 @@
         letter = 'U';
     add_log (letter, vfile);
 
+    fileattr_set (vfile, "_head", vers->srcfile->head);
+
     freevers_ts (&vers);
     return (0);
 }
@@ -1036,6 +1044,10 @@
     if (tocvsPath)
         if (unlink_file_dir (tocvsPath) < 0)
                 error (0, errno, "cannot remove %s", tocvsPath);
+
+    fileattr_set (user, "_head", vhead);
+    fileattr_set (user, "_branch", vbranch);
+
     return (err);
 
 write_error:
Index: rcs.c
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/rcs.c,v
retrieving revision 1.45
diff -u -r1.45 rcs.c
--- rcs.c       1996/06/15 05:16:31     1.45
+++ rcs.c       1996/06/17 20:30:43
@@ -10,6 +10,7 @@
 
 #include
 #include "cvs.h"
+#include "fileattr.h"
 
 static RCSNode *RCS_parsercsfile_i PROTO((FILE * fp, const char *rcsfile));
 static char *RCS_getdatebranch PROTO((RCSNode * rcs, char *date, char *branch));
@@ -107,6 +108,65 @@
 }
 
 /*
+ * Parse an RCS file if we know that we can fetch file attributes.  This
+ * skips reading the file if the attributes are available.
+ */
+
+RCSNode *
+RCS_parse_with_attrs (file, repos)
+    const char *file;
+    const char *repos;
+{
+    char *head;
+    RCSNode *rcs;
+
+    head = fileattr_get0 (file, "_head");
+    if (head != NULL)
+    {
+        char rcsfile[PATH_MAX];
+
+        /* We have head and branch attributes for this file, so we
+           don't need to actually open it.  */
+        rcs = (RCSNode *) xmalloc (sizeof (RCSNode));
+        memset ((char *) rcs, 0, sizeof (RCSNode));
+        rcs->refcount = 1;
+
+        if (fileattr_get (file, "_attic") == NULL)
+            (void) sprintf (rcsfile, "%s/%s%s", repos, file, RCSEXT);
+        else
+        {
+            (void) sprintf (rcsfile, "%s/%s/%s%s", repos, CVSATTIC, file,
+                            RCSEXT);
+            rcs->flags |= INATTIC;
+        }
+        rcs->path = xstrdup (rcsfile);
+
+        rcs->head = head;
+        rcs->branch = fileattr_get0 (file, "_branch");
+
+        rcs->flags |= PARTIAL | VALID;
+
+        return rcs;
+    }
+
+    /* For backward compatibility for repositories which don't have
+       head and branch attributes stored yet, just call RCS_parse, and
+       then store the attributes.  */
+    rcs = RCS_parse (file, repos);
+
+    if (rcs != NULL)
+    {
+        fileattr_set (file, "_head", rcs->head);
+        if (rcs->branch != NULL)
+            fileattr_set (file, "_branch", rcs->branch);
+        if (rcs->flags & INATTIC)
+            fileattr_set (file, "_attic", "");
+    }
+
+    return rcs;
+}
+
+/*
  * Parse a specific rcsfile.
  */
 RCSNode *
@@ -268,6 +328,22 @@
             continue;
         }
 
+        if (strcmp (RCSHEAD, key) == 0
+            && strcmp (value, rdata->head) != 0)
+        {
+            error (1, 0, "head attribute does not match file for `%s'",
+                   rcsfile);
+        }
+
+        if (strcmp (RCSBRANCH, key) == 0
+            && (value == NULL
+                ? rdata->branch != NULL
+                : rdata->branch == NULL || strcmp (value, rdata->branch) != 0))
+        {
+            error (1, 0, "branch attribute does not match file for `%s'",
+                   rcsfile);
+        }
+
         /*
          * check key for '.''s and digits (probably a rev) if it is a
          * revision, we are done with the headers and are down to the
Index: rcs.h
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/rcs.h,v
retrieving revision 1.19
diff -u -r1.19 rcs.h
--- rcs.h       1996/06/15 05:16:32     1.19
+++ rcs.h       1996/06/17 20:30:44
@@ -82,6 +82,7 @@
  * exported interfaces
  */
 RCSNode *RCS_parse PROTO((const char *file, const char *repos));
+RCSNode *RCS_parse_with_attrs PROTO((const char *file, const char *repos));
 RCSNode *RCS_parsercsfile PROTO((char *rcsfile));
 char *RCS_check_kflag PROTO((const char *arg));
 char *RCS_getdate PROTO((RCSNode * rcs, char *date, int force_tag_match));
Index: recurse.c
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/recurse.c,v
retrieving revision 1.30
diff -u -r1.30 recurse.c
--- recurse.c   1996/06/15 03:58:43     1.30
+++ recurse.c   1996/06/17 20:32:16
@@ -513,7 +513,7 @@
     strcat (finfo->fullname, finfo->file);
 
     if (dosrcs && repository)
-        finfo->rcs = RCS_parse (finfo->file, repository);
+        finfo->rcs = RCS_parse_with_attrs (finfo->file, repository);
     else
         finfo->rcs = (RCSNode *) NULL;
     ret = fileproc (finfo);
Index: sanity.sh
===================================================================
RCS file: /cvs/cvsfiles/devo/cvs/src/sanity.sh,v
retrieving revision 1.43
diff -u -r1.43 sanity.sh
--- sanity.sh   1996/06/15 04:38:02     1.43
+++ sanity.sh   1996/06/17 20:33:11
@@ -2993,7 +2993,9 @@
         dotest devcom-b0 "${testcvs} watch off" ''
         dotest devcom-b1 "${testcvs} watch remove" ''
         # Test that CVS 1.6 and earlier can handle the repository.
-        dotest_fail devcom-b2 "test -d ${CVSROOT_DIRNAME}/first-dir/CVS"
+        # This test no longer works, because of the new _head, .etc,
+        # attributes.
+        # dotest_fail devcom-b2 "test -d ${CVSROOT_DIRNAME}/first-dir/CVS"
 
         cd ../..
         rm -rf 1 2 3 ${CVSROOT_DIRNAME}/first-dir
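
The revision arithmetic in the remove_file hunk of commit.c (bump the last dotted component of the head revision before storing it as _head) can be tried in isolation.  The following is a sketch only: next_revision is a hypothetical helper name, plain malloc is used in place of xmalloc, and a check for a missing '.' is added that the patch itself does not make.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated string holding HEAD with its last dotted
   component incremented, or NULL on failure.  The caller frees it.  */
char *next_revision(const char *head)
{
    char *new_rev, *cp;
    int v;

    assert(head != NULL);

    /* One extra byte for a possible carry digit (9 -> 10), one for
       the terminating NUL; same sizing as in the patch.  */
    new_rev = malloc(strlen(head) + 2);
    if (new_rev == NULL)
        return NULL;
    strcpy(new_rev, head);

    cp = strrchr(new_rev, '.');
    if (cp == NULL)
    {
        /* Not a dotted revision number; added guard, not in the patch.  */
        free(new_rev);
        return NULL;
    }

    v = atoi(cp + 1);
    sprintf(cp + 1, "%d", v + 1);
    return new_rev;
}
```

The strlen + 2 sizing is enough because incrementing a decimal number adds at most one digit.  Whether the incremented value always matches the revision that the removal actually commits is an assumption the patch makes; the sketch does not check it.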