aboutsummaryrefslogtreecommitdiffstats
path: root/cache.c
Commit message (Collapse)AuthorAgeFilesLines
* Redesign the caching layerLars Hjemli2008-04-281-68/+291
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original caching layer in cgit has no upper bound on the number of concurrent cache entries, so when cgit is traversed by a spider (like the googlebot), the cache might end up filling your disk. Also, if any error occurs in the cache layer, no content is returned to the client. This patch redesigns the caching layer to avoid these flaws by * giving the cache a bound number of slots * disabling the cache for the current request when errors occur The cache size limit is implemented by hashing the querystring (the cache lookup key) and generating a cache filename based on this hash modulo the cache size. In order to detect hash collisions, the full lookup key (i.e. the querystring) is stored in the cache file (separated from its associated content by ascii 0). The cache filename is the reversed 8-digit hexadecimal representation of hash(key) % cache_size which should make the filesystem lookup pretty fast (if directory content is indexed/sorted); reversing the representation avoids the problem where all keys have equal prefix. There is a new config option, cache-size, which sets the upper bound for the cache. Default value for this option is 0, which has the same effect as setting nocache=1 (hence nocache is now deprecated). Included in this patch is also a new testfile which verifies that the new option works as intended. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add cache.hLars Hjemli2008-03-271-0/+1
| | | | | | | | The functions found in cache.c are only used by cgit.c, so there's no point in rebuilding all object files when the cache interface is changed. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Move cgit_repo into cgit_contextLars Hjemli2008-02-161-3/+3
| | | | | | | | This removes the global variable which is used to keep track of the currently selected repository, and adds a new variable in the cgit_context structure. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add all config variables into struct cgit_contextLars Hjemli2008-02-161-5/+5
| | | | | | | | This removes another big set of global variables, and introduces the cgit_prepare_context() function which populates a context-variable with compile-time default values. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Introduce struct cgit_contextLars Hjemli2008-02-161-2/+2
| | | | | | | | | This struct will hold all the cgit runtime information currently found in a multitude of global variables. The first cleanup removes all querystring-related variables. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* cache_safe_filename() needs more buffersLars Hjemli2007-05-181-4/+9
| | | | | | | | The single static buffer makes it impossible to use the result of two different calls to this function simultaneously. Fix it by using 4 buffers. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Enable url=value querystring parameterLars Hjemli2007-05-181-3/+6
| | | | | | | This makes is possible to use repo-urls like '/pub/scm/git/git.git' and even add path specifications, like '/pub/scm/git/git.git/log/documentation'. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Remove troublesome chars from cachefile namesLars Hjemli2007-01-121-0/+16
| | | | | | | Add a funtion cache_safe_filename() which replaces possibly bad filename characters with '_'. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Move cache_prepare() to cgitLars Hjemli2007-01-121-22/+0
| | | | | | This moves some cgit-specific stuff away from cache.c Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Allow relative paths for cgit_cache_rootLars Hjemli2006-12-161-0/+4
| | | | | | | | | | Make sure we chdir(2) back to the original getcwd(2) when a page has been generated. Also, if the cgit_cache_root do not exist, try to create it. This is a feature intended to ease testing/debugging. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* cache_lock: do xstrdup/free on lockfileLars Hjemli2006-12-121-1/+2
| | | | | | | | | | | | | | | Since fmt() uses 8 alternating static buffers, and cache_lock might call cache_create_dirs() multiple times, which in turn might call fmt() twice, after four iterations lockfile would be overwritten by a cachedirectory path. In worst case, this could cause the cachedirectory to be unlinked and replaced by a cachefile. Fix: use xstrdup() on the result from fmt() before assigning to lockfile, and call free(lockfile) before exit. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Don't truncate valid cachefilesLars Hjemli2006-12-111-0/+5
| | | | | | | | | | | | | | An embarrassing thinko in cgit_check_cache() would truncate valid cachefiles in the following situation: 1) process A notices a missing/expired cachefile 2) process B gets scheduled, locks, fills and unlocks the cachefile 3) process A gets scheduled, locks the cachefile, notices that the cachefile now exist/is not expired anymore, and continues to overwrite it with an empty lockfile. Thanks to Linus for noticing (again). Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Avoid infinite loops in caching layerLars Hjemli2006-12-111-13/+22
| | | | | | | | | | | Add a global variable, cgit_max_lock_attemps, to avoid the possibility of infinite loops when failing to acquire a lockfile. This could happen on broken setups or under crazy server load. Incidentally, this also fixes a lurking bug in cache_lock() where an uninitialized returnvalue was used. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Fix cache algorithm loopholeLars Hjemli2006-12-111-1/+5
| | | | | | | | This closes the door for unneccessary calls to cgit_fill_cache(). Noticed by Linus. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add license file and copyright noticesLars Hjemli2006-12-101-0/+8
| | | | Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add caching infrastructureLars Hjemli2006-12-101-0/+86
This enables internal caching of page output. Page requests are split into four groups: 1) repo listing (front page) 2) repo summary 3) repo pages w/symbolic references in query string 4) repo pages w/constant sha1's in query string Each group has a TTL specified in minutes. When a page is requested, a cached filename is stat(2)'ed and st_mtime is compared to time(2). If TTL has expired (or the file didn't exist), the cached file is regenerated. When generating a cached file, locking is used to avoid parallell processing of the request. If multiple processes tries to aquire the same lock, the ones who fail to get the lock serves the (expired) cached file. If the cached file don't exist, the process instead calls sched_yield(2) before restarting the request processing. Signed-off-by: Lars Hjemli <hjemli@gmail.com>