### Steps to self-host gitweb on OpenBSD

Build a simple web interface for your git repos using mostly built-in
functionality of OpenBSD and git.

**Note:** I reconstructed these steps from memory after the fact. If any step
fails for you as written, please email me so we can fix it.

All repos will have public read-only git access. We'll also create a git user
for write access over SSH.

    # software deps not in base system (pandoc to render readmes)

    useradd -m -s $(which git-shell) git

    cat <<EOF >> /etc/ssh/sshd_config
    AllowAgentForwarding no

    # for cleaner read-write git urls
    ln -s /var/www/git /home/git/repo
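
Restrictions for a git-only account are commonly scoped with a `Match User` block so they don't touch other logins. The sketch below is an assumption about what such a block could contain, not the exact settings from these steps. It writes to a scratch file (a hypothetical path from `mktemp`) so it can be reviewed before anything is appended to `/etc/ssh/sshd_config`.

```sh
# Hypothetical example: restrictions that apply to the git user only.
# Written to a scratch file for review, not to the live sshd_config.
scratch=$(mktemp)
cat <<'EOF' > "$scratch"
Match User git
    AllowAgentForwarding no
    AllowTcpForwarding no
    X11Forwarding no
    PermitTTY no
EOF
cat "$scratch"
```

After changing the real config, `sshd -t` checks the syntax before you restart the daemon.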

#### chroot gitweb deps

The OpenBSD base system contains httpd and slowcgi, which we'll use to run
gitweb. By default, CGI scripts have a chroot of `/var/www`, so any programs
and shared libraries required by gitweb need to be copied there.

    # home directory for projects
    install -g daemon -o git -d /var/www/git

    # helper scripts from cozy forge
    ftp -o - https://dev.begriffs.com/repo-ui/snapshot/main.tar.gz | tar zxf -

    # copy required binaries and shared libs to chroot
    /usr/local/libexec/git/git-archive
    ./imprison "$path" /var/www

    # also copy system perl modules
    ./imprison-perl-modules /var/www

    cp /usr/local/share/gitweb/gitweb.cgi /var/www/cgi-bin

    # gitweb also needs a chrooted /dev/null
    mknod /var/www/dev/null c 2 2 root:daemon
    chmod 0666 /var/www/dev/null

    # git hook our repos will use
    install -D post-update /var/www/bin/post-update
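
As a rough illustration of what copying a program into a chroot involves: the binary and every shared library that `ldd` reports each need to land at the same path relative to the new root. This is only a sketch of the idea, not the `imprison` script itself. The function name `copy_into_jail` and the scratch directory are hypothetical, and the parsing is deliberately loose because `ldd` output differs between systems.

```sh
# Hypothetical sketch: copy a binary and its shared libraries into a jail,
# keeping each file's absolute path relative to the new root.
copy_into_jail() {
	bin=$1 jail=$2
	# ldd prints library paths among other columns; keep tokens that look
	# like absolute paths and skip anything that isn't a regular file.
	for f in "$bin" $(ldd "$bin" 2>/dev/null | grep -o '/[^ ]*'); do
		[ -f "$f" ] || continue
		mkdir -p "$jail$(dirname "$f")"
		cp -p "$f" "$jail$f"
	done
}

# Demo against a throwaway directory instead of /var/www
jail=$(mktemp -d)
copy_into_jail "$(command -v ls)" "$jail"
find "$jail" -type f
```

Listing the jail afterwards is a quick way to spot a library that was missed.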

#### generate tls certs

The OpenBSD base system contains acme-client(1) to retrieve https certs from
Let's Encrypt. There is plenty of documentation online for this step. Try Roman
Zolotarev's [guide](https://romanzolotarev.com/openbsd/acme-client.html).

When finished with this step, you'll have public and private keys in a
subdirectory of `/etc/ssl`, and an `/etc/httpd.conf` file for your domain.
You'll also have a cron job configured to renew your certificate.

#### configure httpd and gitweb

**Note:** the provided configuration files need customization for your domain
name and the directories you used for your certs.

    # default gitweb css and lightweight js
    cp -R /usr/local/share/gitweb/static /var/www

    # back up your httpd config
    cp /etc/httpd.conf /tmp/httpd.conf.bak

    # install configuration for gitweb and slowcgi

    # edit httpd.conf to customize for your domain
    # (consulting your backup as needed)

    # gitweb configuration
    install -D gitweb.conf /var/www/conf/gitweb.conf

    # update the domain name in the conf
    vi /var/www/conf/gitweb.conf

    # add a message to your forge homepage
    cat <<EOF > /var/www/conf/projects_list_head.html
    <p>Introductory content for your projects list</p>
    EOF

All projects are stored as bare repos under `/var/www/git`. This directory is
within the slowcgi chroot, unlike someplace like `/home/git`.

The category and description of each repo as displayed by gitweb are stored in
plain text files within the bare repo (as part of git's database, not as
versioned source code). Additionally, each repo needs a `post-update` hook to
call `git update-server-info` whenever new code is pushed. We're using httpd to
serve repos over "dumb https" rather than using a dedicated git protocol server.
The `update-server-info` command prepares the files that make dumb http work.

Use a helper script to do all this:

    # create the foo project, in category bar, with a description
    ./repo-new foo bar "a wonderful project"
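
To make the moving parts concrete, here is a minimal sketch of the metadata and hook setup described above. It is not the `repo-new` script itself. The function name `set_repo_metadata` is hypothetical, and writing the hook directly into the repo (rather than reusing `/var/www/bin/post-update`) is an assumption made to keep the example self-contained. gitweb reads the `description` and `category` files from the top of the bare repo.

```sh
# Hypothetical stand-in for the metadata part of a repo-creation helper.
set_repo_metadata() {
	repo=$1 category=$2 description=$3
	# plain text files that gitweb displays in the project list
	printf '%s\n' "$description" > "$repo/description"
	printf '%s\n' "$category" > "$repo/category"
	# hook that refreshes the dumb-http index files after each push
	mkdir -p "$repo/hooks"
	printf '#!/bin/sh\nexec git update-server-info\n' > "$repo/hooks/post-update"
	chmod +x "$repo/hooks/post-update"
}

# Demo against a scratch directory standing in for a bare repo
# (a real one would come from: git init --bare /var/www/git/foo)
repo=$(mktemp -d)/foo
mkdir -p "$repo"
set_repo_metadata "$repo" bar "a wonderful project"
ls -R "$repo"
```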

The project will expose two git URLs:

* read-only: https://example.com/git/foo
* read-write: git@example.com:repo/foo
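
One checkout can use both URLs: fetch over https and push over SSH. The helper below is a hypothetical convenience, not part of the forge scripts. `example.com` and `foo` are the placeholder names from the list above.

```sh
# Hypothetical helper: clone over https, then point pushes at the SSH URL.
clone_rw() {
	host=$1 project=$2
	git clone "https://$host/git/$project" "$project" &&
	git -C "$project" remote set-url --push origin "git@$host:repo/$project"
}

# usage: clone_rw example.com foo
```

Afterwards `git remote -v` in the checkout should list the https URL for fetch and the SSH URL for push.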

To mirror dependencies on your own git server, use the helper script:

    # for example, a hypothetical project foo on github
    ./repo-mirror foo https://github.com/user/foo.git
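
A mirror of this kind can be built from two stock git commands. The sketch below is an assumption about the general approach, and the actual `repo-mirror` script may work differently. The function names and the optional third argument for the repo root are hypothetical.

```sh
# Hypothetical sketch of mirroring an upstream repo into the forge.
mirror_repo() {
	name=$1 url=$2 root=${3:-/var/www/git}
	# --mirror makes a bare copy that tracks every ref of the upstream
	git clone --mirror "$url" "$root/$name" &&
	# refresh the files that dumb-http clients read
	git -C "$root/$name" update-server-info
}

# Later refreshes (for example from cron) could look like this.
refresh_mirror() {
	repo=${2:-/var/www/git}/$1
	git -C "$repo" remote update --prune &&
	git -C "$repo" update-server-info
}

# usage: mirror_repo foo https://github.com/user/foo.git
```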

#### block large plagiarism models

Hosting your own projects off GitHub from the start avoids their being
automatically used for training. However, any public website will still be
scraped, including open source projects exposed by gitweb. The companies doing
statistical plagiarism often do not respect robots.txt and must be blocked

One approach is publishing poisoned urls (ones forbidden by robots.txt), and
implementing server-side rules to temporarily ban any IPs accessing those urls.
Another approach is to make the client perform a somewhat costly computation to
process web responses.
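
The poisoned-url approach could be sketched as a log filter that prints each client address requesting the trap path. Everything named here is an assumption: the `/trap/` path, the `trap_hits` function, and a log format where the second field is the client IP and the request line sits inside double quotes.

```sh
# Print each client IP once if it requested a path under the trap prefix.
# Assumed line format:
#   <server> <ip> - - [date] "GET <path> HTTP/1.1" <status> <size>
trap_hits() {
	awk -v trap_path="$1" '
		{
			# splitting on double quotes puts the request line in q[2]
			n = split($0, q, "\"")
			if (n < 3) next
			split(q[2], req, " ")
			# req[2] is the path; report an IP only the first time
			if (index(req[2], trap_path) == 1 && !seen[$2]++) print $2
		}'
}

# Demo with a made-up log
log=$(mktemp)
cat <<'EOF' > "$log"
example.com 203.0.113.7 - - [01/Jan/2025:00:00:00 +0000] "GET /trap/ HTTP/1.1" 200 12
example.com 198.51.100.2 - - [01/Jan/2025:00:00:01 +0000] "GET /git/foo HTTP/1.1" 200 99
example.com 203.0.113.7 - - [01/Jan/2025:00:00:02 +0000] "GET /trap/x HTTP/1.1" 200 12
EOF
trap_hits /trap/ < "$log"   # prints 203.0.113.7
```

Each printed address could then be added to a pf table, for example `pfctl -t badbots -T add <ip>`, with a matching `block` rule in `pf.conf`. The table name `badbots` is made up for this example. Running `pfctl -t badbots -T expire 86400` from cron would drop entries after roughly a day, which is one way to keep the ban temporary.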

The easiest way to deter bots is using Cloudflare's new [bot blocking
proxy](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click/).
While putting a site behind Cloudflare does support the centralization of the
internet, it's a quick way to get started while perhaps developing another