The Thesis

All PhD candidates around the world know about the thesis. You always knew about the thesis. It marks the beginning of the end of your career as a PhD candidate, and if you actually do it, you can have that cool “Dr.” title that you always wanted on your business card. What is the problem then? Why does it seem so frustrating when you sit down to do it? The following is based on a true story, actually my story: how I managed to write it down and track my progress.

Problem Definition

A typical PhD follows a simple process: read, think, propose, publish, and write the thesis. It is straightforward, and one can imagine that if you are already there with the rest of the stuff, the write-up should be rather easy. But it is not.

The problem lies mostly in the fact that writing the thesis is a lengthy and lonely act. You have to do it yourself; nobody will come to your aid, except maybe your advisor.

In my case, I faced the following problem: for quite some time, I could not motivate myself to write it down. I would begin writing and, half a page later, I would stop. I tried everything, but nothing seemed to motivate me. My advisor got uncomfortable, and we began talking about a method to track my progress that would motivate me.

The Idea

Then I saw it: Georgios Gousios’s Thesis-o-meter (see link below). This was a couple of scripts that posted, every day, the progress of the PhD in each chapter. I decided to do it myself, introducing some alterations that would work better for me.

First, I had to find a tangible way to measure the progress. I thought that was easy: the number of pages. The number of pages of a document is nice if you want to measure the size of the text, but it surely cannot act as a day-to-day key performance indicator (KPI). And why is that? Simply because, if you bootstrap your thesis in LaTeX and put in all the standard chapters, bibliography, etc., you will find yourself with at least 15 pages. So, that day I would show enormous progress. The next day, I would write only text, one or two pages I think. The day after, I would write text and put in some charts. This would count as three or four pages. Better, huh? This is the problem.

If you are a person like me, you could add one or two figures and say, “Ok, I am good for today, I added two pages!” This is a nice excuse if you want to procrastinate. I needed something that would present the naked truth, something that would make me sit there and make some serious progress.

So, the number of pages was out of the question as a daily metric, but I thought that we could still use it. The number of pages would be the end goal, with a minimum and a maximum. In Greece, a PhD thesis is usually 150 to 200 pages long (in my discipline of course, computer science). So, I thought, this is the goal: a large block of text around those limits.

Then I thought that my metric should be the number of words in the text instead of the number of pages. Since I wrote my thesis in LaTeX, I could just count the words in each file with standard UNIX tools, for example with the command wc -w myfile.tex. So, the algorithm has the following steps:

  • The goal is set to 150-200 pages in total
  • Each day,
    • Count the words for all files
    • Count the pages of the actual thesis file, for example the output PDF
    • Find the word contribution for that day just by subtracting the previous day’s word count
    • Find an average of words per number of pages
    • Finally, provide an estimation for the completion of the thesis
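
The counting steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the actual thesis-o-meter: the function names and the choice of file patterns are assumptions, and splitting on whitespace only roughly matches what wc -w reports.

```python
from pathlib import Path


def count_words(path):
    """Count whitespace-separated tokens in one file, roughly like `wc -w`."""
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    return len(text.split())


def total_words(build_dir, patterns=("*.tex", "*.bib")):
    """Sum the word counts of all matching files directly under build_dir.

    The patterns are an assumption: adjust them to the files of your thesis.
    """
    return sum(
        count_words(path)
        for pattern in patterns
        for path in sorted(Path(build_dir).glob(pattern))
    )


def daily_contribution(today_total, previous_total):
    """Words added since the previous count (negative if text was cut)."""
    return today_total - previous_total
```

Calling total_words("build") once a day and storing the result somewhere (a file, a repository, an email) is enough to compute the daily contribution for the next run.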

Experience Report

I implemented this in Python and shell script. The process worked: each day a report was generated and sent to my advisor. But the best thing was that each day, I saw the estimation trimmed down a little. This is the last report I produced:

10c10
     1899 build/2-meta-programming.tex
13c13
     1164 build/3-requirements.tex
60,61c60,61
<    13931 build/thesis.bib
    14058 build/thesis.bib
>    55747 total

---- Progress ----
Worked for 167 day(s) ... 
Last submission: 20121025
Word Count (last version): 55747
Page Count (last version): 179
Avg Words per Page (last version): 311
Last submission effort: 142

---- Estimations ----
Page Count Range (final version): (min, max) = (150, 200)
Word Count Range (final version): (min, max) = (46650, 62200)
Avg Effort (Words per Day): 184
Estimated Completion in: (min, max) = (-50, 35) days, (-2.50, 1.75) months
Estimated Completion Date: (best, worst) = (2012-08-11, 2012-12-16)

The average number of words per page was 311, and I wrote about 184 words each day.
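
The estimates in the report follow from simple arithmetic on these numbers. The sketch below reproduces them, but note what is assumed: the rounding rules (integer division, rounding down) and the constants of 20 working days and 30 calendar days per month are inferred from the reported figures, not taken from the original scripts, and the function and parameter names are hypothetical.

```python
import math
from datetime import date, timedelta


def estimate(word_count, page_count, words_per_day, last_submission,
             page_range=(150, 200), workdays_per_month=20, days_per_month=30):
    """Estimate the remaining effort from the current counts.

    All rounding choices and month lengths are assumptions chosen to
    match the figures in the report above.
    """
    words_per_page = word_count // page_count
    # Translate the page goal into a word goal.
    word_range = tuple(pages * words_per_page for pages in page_range)
    # Days of writing left; negative means the minimum is already exceeded.
    days = tuple(math.floor((goal - word_count) / words_per_day)
                 for goal in word_range)
    months = tuple(d / workdays_per_month for d in days)
    dates = tuple(last_submission + timedelta(days=int(m * days_per_month))
                  for m in months)
    return {
        "words_per_page": words_per_page,
        "word_range": word_range,
        "days": days,
        "months": months,
        "dates": dates,
    }


report = estimate(55747, 179, 184, date(2012, 10, 25))
print(report["word_range"])  # (46650, 62200)
print(report["days"])        # (-50, 35)
print(report["dates"])       # 2012-08-11 and 2012-12-16
```

A negative minimum simply says that the text is already longer than the shortest acceptable thesis, so the only open question is how close to the maximum it should get.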

Epilogue

I wrote my thesis, but I have not submitted it (at least not yet, but I hope to soon), for a number of practical reasons. Still, the process succeeded: I found my KPIs and they actually led me to finish the work. This is a fact, and now I have to find another motivation-driven method to do the rest of the required stuff. C’est la vie.

Related Links and Availability

I plan to release an open source version of my thesis-o-meter on my GitHub profile soon. I also found various alternative thesis-o-meters:

The original post can be found on the XRDS blog.

Chatbots: The Revenge of Command-Line

Recently I began hearing terms like “conversational marketing”, “chatbots”, etc., and I thought it must be a joke or something. It was during my university years (around 1995) that Windows 95 was released and the GUI paradigm won the competition against the command-line, which was more popular on Unix-based systems at the time.

I even remember students, among several others, expressing their disappointment with the command-line of Unix, and how in Windows one could administer the services and the apps more efficiently. I thought back then that it was lack of experience. You know, it is like Leon said: “first you begin with the sniper rifle, as far as you can be from your victim, then you end up with the knife”. So, it was typical for inexperienced users to begin with the UI that had all the options exposed, and for the most experienced to end up with the command-line, which provided rapid configuration and automation.

Time has passed since then, and the GUI obviously won the war. My kin (aka people that liked the command-line) ended up with OS X, which has excellent apps and stability, backed by a Unix-based system.

But as it seems, revenge is a dish best served cold. Chatbots, which are the latest trend, are backed by artificial intelligence and provide automation through a chat interface, like Slack. But the chat interface is like the command-line, right? You have to memorize commands in order to get specific functionality or information. So, all these people that hated the command-line now love it through chat. And they will be very happy to use it! Cool, huh?

Either way, things are very interesting these days, and the progress in applications and hardware has been really significant. Now that I think about it, maybe bash should be a chatbot, right? Put a little AI (and NLP) in the traditional shell. Who knows?

Meeting that Deadline, Part One

I have been CTO of e-food (http://www.e-food.gr/) for almost one year now, and my duties have involved several areas, including a complete restructuring of the IT department, modernising the software stack and battling several old demons of the current system. In our effort to address all these issues, we became more agile and hired several developers (I prefer the term developer to engineer for some reason). In one of the latest sprint reviews and retrospectives, a newly hired developer, only 3-4 days on the job at the time, said: “you know, I feel there are no real deadlines here, I do not really know if this is a problem, but I cannot feel any pressure so far”. I listened to his comment very carefully and thought at the time, “what exactly did he have in mind, that we have to feel pressured to deliver things all the time? I prefer to be able to deliver things with no pressure at all”. Thus I began thinking about my role there, and what exactly I should do as CTO of the company.

The Short Answer

The short answer I gave to the team was that my role is to keep the pressure out. I should talk with all the people in the company, make the hard choices regarding technology, give my insights, even deal with complex problems on my own, but let my team think that there is no actual deadline. The deadlines are set in each sprint’s goals, but we should use them to organise our work and coordinate with the other departments of the company, and not to actually make ourselves obsessed with them. Of course, we should not miss them, but missing them is also meeting the sprint’s goals, right?

Did he like my answer? To be honest, I do not know (even now) whether he was convinced at the time, but I think these issues are answered in due time, when things begin to happen and, most importantly, get delivered and deployed.

The Distance

At the time, I was still actively involved in the sprints. The team was really small when I arrived, and I was very picky in my selection of people. I developed some critical parts of the system in order to help the team build some confidence and gain some velocity. But one day, the CMO of the company asked me, “why are you in the sprints? You know, I was at first, but it has no meaning. You have to trust your people and focus on higher level issues.” Oops, it was the second time that I found myself wondering whether my previous life as a developer was making me take wrong decisions. So, I decided to get some distance from day-to-day coding. Effective immediately, I removed myself from the sprint, getting the appropriate distance from “production” code and also from deadlines, which from now on I have to coordinate people to meet rather than meeting them myself.

… to be continued

Stay Tuned …

Behold, my new blog entry in XRDS: Language Bureaucracy

Laziness, impatience and hubris are the three great virtues that each programmer should have, at least according to Larry Wall [1]. My experience so far showed me that he was right. All programmers have these characteristics, if they do not, usually they are not real programmers. Since they are expressing these values with the usage of several programming languages, they tend to compare them. Usually this comparison ends up with a phenomenon called flame wars …. read more

and a small article I recently wrote and published on LinkedIn: Revenge of the SQL

The mainstream is always under attack. So is SQL and in general the various RDBMS’s that implement its various incarnations. After the golden age of MySQL and the LAMP (Linux-Apache-MySQL-PHP) stack, various NoSQL databases appeared, preaching that the data storage as we knew it was coming to an end … read more

I have a few ideas in my drawer and plan to write a few more and revisit some older ideas. Stay tuned.

Top Free App … only 13.99

If you have an old enough Mac (with Mavericks) and you try to download the “free” version of iMovie, you need to pay 13.99 euros. Nice, huh?

Maybe they could have included the iMovie price in the Mavericks upgrade fee, because right now it is quite funny to pay 25 euros for the operating system upgrade and 13.99 for iMovie alone (I will add that Pages, Numbers, etc. are also free only for owners of new Macs).
