As we commemorate the Internet Archive turning 25, I decided to unearth some memories from the most precious days of my life.
I attended Devi Balika Vidyalaya, Colombo, Sri Lanka for my high school education (2004-2012). In 2004, I joined the Junior Western Band of our school, which paved the way for me to join “DBVSBB”, the Devi Balika Vidyalaya Senior Brass Band, the following year. As a senior brass band member at my school for seven years (Figure 01), I attended many concerts, received many certificates, and won numerous competitions. Fast forward to 2021: as a Ph.D. student working in the realm of web archiving, I was keen to look through web archives for any online trace of our band’s achievements from that time.
Figure 01: A few pictures taken at the band practices and concerts over the years.
Figure 03: The article featuring DBVSBB in the Sunday Observer newspaper.
My name, “himarsha”, is rare compared to most names in Sri Lanka. Although I never heard of another “himarsha” in our school during that time, there was a “himashi” and a “himasha” in my class. Even though it is not a common name, my search results are overshadowed by a famous person, “Himarsha Venkatsamy”, an Indian model. If you google just “himarsha”, the SERP is filled with hits related to this model (Figure 04). On the other hand, the lengthy acronym for the band, “DBVSBB”, acts almost like a hash value. Had the acronym been a shorter, more common one (say, “XYZ”) instead of “DBVSBB”, then between the popular “himarsha” and “XYZ” it would have been impossible to find this newspaper article featuring the band. Our band slogan is “Unique from the rest”, and this Google search made me realize that the band’s acronym indeed lives up to that phrase.
Figure 04: A screen capture of the Google SERP for “himarsha”.
I looked for any mementos in the Internet Archive for the Sunday Observer Archives URL of the news article (Figure 05). I was able to find a single memento from Nov 16, 2016 (Figure 06).
Figure 05: TimeMap for the URL http://archives.sundayobserver.lk/2008/11/23/mag14.asp.
Figure 06: The memento from 2016 for http://archives.sundayobserver.lk/2008/11/23/mag14.asp.
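The lookup above can also be done from the command line. The following is a minimal sketch, not the exact method used for the figures: the function names `ia_timemap_url` and `ia_memento_url` are hypothetical helpers, and the only things taken from the Wayback Machine are its two URL patterns (a link-format TimeMap under `/web/timemap/link/` and a capture under `/web/<timestamp>/`).

```shell
# Hypothetical helpers that only build Wayback Machine URLs; no network access.

# URL of the link-format TimeMap (the list of all captures) for an original URL.
ia_timemap_url() {
  printf 'http://web.archive.org/web/timemap/link/%s\n' "$1"
}

# URL of the capture closest to a timestamp (YYYYMMDD or YYYYMMDDhhmmss).
# $1 = timestamp, $2 = original URL.
ia_memento_url() {
  printf 'https://web.archive.org/web/%s/%s\n' "$1" "$2"
}
```

For example, `curl -s "$(ia_timemap_url 'http://archives.sundayobserver.lk/2008/11/23/mag14.asp')"` would fetch the TimeMap, and `ia_memento_url 20161116 'http://archives.sundayobserver.lk/2008/11/23/mag14.asp'` would give a URL that the Wayback Machine resolves to the capture nearest that date.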
Just out of curiosity, I stripped the “archives.” prefix from the hostname and tried accessing “http://sundayobserver.lk/2008/11/23/mag14.asp”. After a 301 redirect to the “www” host, the request returned a “404 Not Found” HTTP status code (Figure 07).
$ curl -iLs "http://sundayobserver.lk/2008/11/23/mag14.asp"
HTTP/1.1 301 Moved Permanently
Date: Mon, 06 Sep 2021 04:48:34 GMT
Content-Type: text/html; charset=iso-8859-1
Transfer-Encoding: chunked
Connection: keep-alive
location: http://www.sundayobserver.lk/2008/11/23/mag14.asp
...
HTTP/1.1 404 Not Found
Date: Mon, 06 Sep 2021 04:48:35 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
...
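The two steps above (dropping the “archives.” prefix and following the redirects) can be sketched as small shell helpers. This is only an illustration: `strip_archives_host` and `redirect_chain` are hypothetical names, and `redirect_chain` assumes that no line of a response body begins with `HTTP/`.

```shell
# Hypothetical helpers; both read from stdin and make no network requests.

# Remove a leading "archives." label from the host of each URL.
strip_archives_host() {
  sed 's#^\(https\{0,1\}://\)archives\.#\1#'
}

# Summarize `curl -iLs` output: one line per hop, "<status>" or
# "<status> -> <Location>" when the response carried a Location header.
redirect_chain() {
  tr -d '\r' | awk '
    function emit(   line) {
      if (code == "") return
      line = code
      if (loc != "") line = line " -> " loc
      print line
    }
    /^HTTP\//                  { emit(); code = $2; loc = "" }
    tolower($1) == "location:" { loc = $2 }
    END                        { emit() }
  '
}
```

Piping the response shown above through it, as in `curl -iLs "http://sundayobserver.lk/2008/11/23/mag14.asp" | redirect_chain`, would print one line for the 301 hop with its target and one line for the final 404.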
Figure 08: Memento for http://sundayobserver.lk/2008/11/23/mag14.asp from 2008.
$ curl -s https://memgator.cs.odu.edu/timemap/link/http://sundayobserver.lk/2008/11/23/mag14.asp | grep datetime | awk '{print $1}' | awk -v FS=/ '{print $3}' | sort | uniq -c | sort -n
2 web.archive.org
$ curl -s https://memgator.cs.odu.edu/timemap/link/http://archives.sundayobserver.lk/2008/11/23/mag14.asp | grep datetime | awk '{print $1}' | awk -v FS=/ '{print $3}' | sort | uniq -c | sort -n
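Both commands above run the same pipeline on a MemGator link-format TimeMap: keep only the lines that carry a `datetime` attribute (the mementos), take the first field (the memento URL in angle brackets), split it on `/` to get the archive's hostname, and count how many times each hostname occurs. As a sketch, the pipeline can be wrapped in a function that reads a TimeMap from stdin; the name `count_by_archive` is hypothetical.

```shell
# Hypothetical wrapper around the pipeline above.
# Reads a link-format TimeMap on stdin and prints "<count> <archive host>",
# sorted by ascending count.
count_by_archive() {
  grep 'datetime=' |        # memento entries only (they have a datetime)
    awk '{print $1}' |      # first field: <URI-M>;
    awk -F/ '{print $3}' |  # third "/"-separated piece: the archive host
    sort | uniq -c | sort -n
}
```

It would be used as `curl -s https://memgator.cs.odu.edu/timemap/link/http://sundayobserver.lk/2008/11/23/mag14.asp | count_by_archive`. A TimeMap with no mementos simply produces no output.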
- Sunday Observer newspaper maintaining its own archive.
- Consistent URL structure between the original news article and the newspaper archive.
- Interaction between Google and web archives: Google indexed this article from the Sunday Observer newspaper archive, which allowed me to use that URL as the lookup key in web archives.