How Web Applications Work

Jun 2014

(Based on a talk for RMIT CSIT)

 

“Computer science is no more about computers than astronomy is about telescopes. The question of whether computers can think is like the question of whether submarines can swim.”

– Edsger Dijkstra

 

“Humankind has not woven the web of life. We are but one thread within it. Whatever we do to the web, we do to ourselves. All things are bound together. All things connect.”

– Chief Seattle

 

I was asked to talk about huge realisations I had when learning to program. I wouldn’t say I’m very good yet, but this is everything I wish had been explained to me when I first started learning how to program, and it accounts for a lot of the “aha” moments.

So much of what makes computer science confusing is the vocabulary involved. I’d say the biggest thing holding back new programmers is sifting through and understanding what everything means. I think the best way to get people to learn to code is to stop teaching it like a science and explain concepts simply in small easy to digest words.

You can learn how to program and still not know how to build anything. You can also know how to build websites without knowing how to program. Learning how to build stuff and learning how to program are two separate entry points into computer science that converge later on.

The thing you normally interact with on a computer is not the computer. It is the operating system. Things that seem natural, like pointing and clicking, are actually not part of the computer. They are a layer of abstraction over it that makes interacting with a computer easier for a layperson. By analogy to a car, the engine is the computer whereas the operating system is the pedals and steering wheel which you use to drive the car.

The language of all of this is code. Each programming language structures its code differently, the same way there are different dialects of a spoken language. Programming languages are written as text into text files. To write code you need a text editor, which works a lot like a word processor. You literally just write and read code in the text editor as you would English.

The way a computer usually tells what kind of code a file contains is the suffix of the file name, also called the extension. The ending of a file denotes the language of the code the file contains. .html means HTML. .css means CSS. .php is PHP. This is a convention rather than a hard rule, but browsers, web servers and operating systems rely on it to decide which program should handle a file. So a file should have the suffix that matches the language inside it. Not every file is executed in the same sense: a browser reads HTML and CSS to draw a page, while a PHP file is run by the PHP interpreter on the server.

When the computer executes one of these text files, it is turning the text into a program or script. If it’s written correctly without any errors, the program is then running. Everything has to be exactly right or it won’t run correctly. It doesn’t matter if you’re close enough, you have to be perfect or it won’t work. Most of what a developer spends their time doing is writing code and then fixing all of the tiny errors when it doesn’t run properly. This is called debugging and is a tedious process.

The way a computer executes a text file into a program is by compiling or interpreting the language. What does a programming language mean? It is an abstraction that represents thousands of tiny operations taking place at the binary level. So compiling or interpreting is taking the code that you write and converting it into code that the computer can read instead.

The only thing a computer can read is binary. Progressively large quantities of binary operations are grouped together and represented by statements and the notion of a programming language develops in the direction of how we understand human languages like English. The closer a programming language is to binary, the lower level it is. The higher level it is the more it resembles English.

What is happening in the computer is that operations are being layered on top of each other until code resembles English more and more. So an if statement which is easy to write in 3 lines of code today might have taken 30 lines of code a decade ago. And 100 lines of code a decade before that. And maybe 300 lines of code a decade before that, when programming was done by punching holes in the right patterns into cards or paper tape for the computer to read as binary.

This is the work of thousands of programmers layering millions of lines of code onto one another. It’s like adding a new layer of soil until progressively a mountain is built. Eventually, given enough time you will be able to write programs in English. The reason it’s in English and not some other language is because the technology was invented by English speakers so their tools resemble their thoughts and they think in English. It is one of the major reasons English is slowly becoming a universal language because to add another layer of technology you have to be fluent in the one already present.

 

If you want to build for the web, it’s important to first learn the foundations of the web. What the web is constructed of is HTML, CSS and Javascript. It’s not useful learning to program without understanding those building blocks first. HTML is the structure of a web page. CSS is what makes it beautiful. Javascript is what makes it interactive.

The way you build a web application is by using a programming language to create rules by which HTML files move around a web application. Think of it like a complex path through the web app that you are building. So when a user does something on the front end, the programming language decides what HTML file to show them and then does that. You can assign an HTML file to a variable and display that HTML file using print or echo. There. You have just programmed a website.
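As a rough sketch of that idea, here is how it might look in Python (chosen only for illustration; the same thing can be done in PHP). The page contents, the ROUTES table and the choose_page function are all made up for this example.

```python
# Hypothetical pages: each "HTML file" is held in a variable as a string.
HOME_PAGE = "<html><body><h1>Home</h1></body></html>"
ABOUT_PAGE = "<html><body><h1>About</h1></body></html>"
NOT_FOUND_PAGE = "<html><body><h1>Page not found</h1></body></html>"

# The "rules" that map what the user asked for to the page they get.
ROUTES = {
    "/": HOME_PAGE,
    "/about": ABOUT_PAGE,
}

def choose_page(path):
    """Return the HTML for the requested path, or a fallback page."""
    return ROUTES.get(path, NOT_FOUND_PAGE)

# Displaying the page is just printing the chosen HTML.
print(choose_page("/about"))
```

A real web app would hand the chosen HTML to a web server instead of printing it, but the decision step is the same.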

More complex sites frequently use something called AJAX, which is typical when there are lots of features. AJAX uses Javascript to make the site more interactive. The downside is that it can use up more bandwidth for the user and server. AJAX allows parts of the site to load and data to be sent in the background, so it processes without you needing to click something or leave the page. It’s frequently used in forms or whenever you need to enter data, which improves the experience for the user. The site will seem smoother because you don’t have to load a new page every time you click on something.

The first HTML file that displays when you visit a website is usually called the index. The file is usually named index.html or index.php. That is the first page displayed. If there is no index file, the server typically has nothing to show at the site’s address unless it has been configured to use a different file. A good way to think of it is the index of all of the files that are contained within this web application. All of the files stem from this original one.

So a web app is like a giant tree where the trunk of the tree is the index file. Then all the branches of the tree spring out of the trunk. So all of the files that form the functionality come from the initial index file. They may be included into it or linked from it but it’s the easiest way of imagining a website. You could have one HTML file that includes a thousand other files and that would be completely natural. This would probably be a really complicated website.

An HTML file contains links to the CSS and Javascript files so when programming you only really focus on moving around HTML files. If an HTML file was a person, including CSS and Javascript would be like putting on clothes to make it look better. Raw HTML pages aren’t very pleasant to look at. You store all of your code which represents functionality in different text files. Then on the server you group and store similar files in different folders. So you will have a CSS folder which will contain all of the CSS files, a Javascript folder which will contain all of the Javascript files, and so on. To make a website accessible over the internet you put all of these files on a server.

When a person is on a website they’ll notice that most parts of the site look pretty similar, like the logo, the buttons, the colour scheme. This is a concept called modularity. Basically modularity means that every page uses the same HTML, CSS and Javascript code so that all the pages in a website look and behave consistently. You write the HTML, CSS and Javascript code once and then access it by including it in separate files. It’s not uncommon for a file to only be a few lines long because it is modular. It just uses functionality from other files to display the web page.
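A small sketch of modularity, written in Python purely for illustration. The header, footer and build_page function are invented for this example; the point is only that the shared pieces are written once and every page reuses them.

```python
# Hypothetical shared pieces, written once and reused by every page.
HEADER = "<header><img src='logo.png'> My Site</header>"
FOOTER = "<footer>Contact us</footer>"

def build_page(body):
    """Wrap page-specific content in the shared header and footer."""
    return HEADER + "\n" + body + "\n" + FOOTER

# Each page only has to supply the part that is unique to it.
home = build_page("<p>Welcome!</p>")
contact = build_page("<p>Email us any time.</p>")
print(home)
```

Changing the logo in HEADER would change it on every page at once, which is the benefit being described.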

 

Anything with an internet connection capable of hosting files and sending them over the internet can be a server. It means a computer can be a server, a smartphone can be a server. But there are special computers designed for the sole purpose of being servers, and many are owned by hosting companies. They are set up with hardware and network connections suited to sending data. Computers designed to be servers usually strip out anything unnecessary, like a screen or the graphical desktop of the operating system, to make them faster. They aren’t meant for any task except being a server.

To understand how a website sends information to you, it’s first worth understanding how the internet works. The internet is basically N numbers of these servers that host data. N is a stupidly large number, probably in the hundreds of millions or billions and it’s constantly increasing as the internet is getting larger and more people are connecting to it.

Everything you see on the internet is hosted on a server somewhere. And the server has to be on all the time. The moment the server switches off, you won’t be able to see any of the data on it anymore. This is why servers are extremely important because without them nothing can exist on the internet. Because an unfathomably big network of servers is the internet. Most of the data contained on servers are websites.

What is happening behind the scenes when you access the internet is that packets of data are being sent from your computer, routed through the telecommunications network you buy your internet from, and then sent to the server which hosts the website you are visiting. Then the website uses that data in its web app and returns something to you. The server sends a packet of data which is routed back through your telecommunications network, is processed by your browser and displays as a change in the webpage.

So using a website is kind of like 2 computers, a browser and a server, talking to each other. And the language they are talking in is data and code. But the server is having millions of similar conversations with other browsers, which are other people visiting the website. Their conversations are different because the data other users send is different.

When you do something on a website like fill out a form, all the data is grouped into packets and this is the path those data packets follow. Billions and trillions of data packets are flying through the internet every second. And the speed of this data flying around is what you are referring to when you talk about how fast your internet is. It’s why some companies have faster internet or better coverage. It’s because their telecommunications networks are better.

The thing which routes these data packets is predictably called a router. A router basically just works out a fast way of getting a data packet from your browser to a server and back to your browser again. If you can imagine data packets like flowing water, routers would be like levers that can change the flow of the water to help it get to the ocean faster.

These packets though are just data. And the packets can be intercepted by malicious programs. I like to imagine it like a game of rugby where the routers are trying to get the ball over the line without it getting intercepted. The goal is to keep the data safe but also to get it to the server and back as fast as possible. One of the tools used to keep it safe is encryption, which is done by the browser and the server rather than by the routers. Encryption scrambles the data in the packet so only the user and the website server can figure out what it is. Data packets are encrypted over secured connections (addresses starting with https) while over unsecured connections they aren’t.

Because there are physical data packets being sent, however lightning fast, the closer the server is to you geographically, the faster the website will display. A server in the US will display a website faster to people in the US than a server in Australia will. This is why most big companies actually have servers all around the world and copy their website onto each server to deliver it to visitors from different countries as fast as they can.

It also means websites that have less data on them will load faster, which is why programmers are constantly trying to reduce it. The speed of a website is directly related to how much money it earns so being fast has a huge benefit. You would probably stop using a website if it took 10 minutes to load and so the company would lose your business or traffic.

 

To prevent excessive processing, your browser stores most of the information sent to it from a server locally on your computer in what’s called a cache. So when the browser needs something it’s already used before, instead of needing to visit the server all over again, it just grabs it from the local copy stored in the cache. So it stores a small copy of the website locally with all the static content.

Static content is usually all the HTML, CSS and Javascript. It is typically anything that does not need to bring information from a database or a file. If it does need to take information from a database or a file then it is called dynamic content, as it dynamically takes things out of a database in the same motion as it displays them.

The way you navigate and traverse a website is via links on the web page. What happens when you click a link, in the couple of seconds before the next page is displayed, is the programming language figures out where you should be sent, what HTML file to display to you, connects to a database and takes data out of the database from disk and stores it in memory, populates that data by slotting it into the HTML, the HTML draws on CSS and Javascript to make it pretty and interactive, then displays it as one fluid document which you see as a dynamic web page. All of that happens in a second.

There is another archetype for a web app now that Javascript is becoming more powerful as a language. That is to have a single HTML page which is the entire web app, and then use Javascript to change what the user sees depending on what they do and how they interact with it. This is starting to become really popular with apps that require a lot of fast interactivity, such as email clients. Lots of trips to a database would make these apps really slow.

But how does the user get to this web app? The way is by entering the address of the website in the address bar of the browser. Every website is actually reached over an IP Address, which is a series of numbers that denote the location of the server. Everything connected to the internet has an IP Address. Your computer does, the server does and they look like this: 192.0.0.1. When a user enters those numbers into the address bar, they are taken to the address of the server. But most of the time they don’t need to, they just enter a domain name instead.

A domain name is a word with a suffix like .com, .net, .org etc. which masks the IP Address via something called A records and CNAME records. An A record points a domain name at an IP Address, while a CNAME record points one domain name at another domain name. Together they allow you to hide the series of numbers which is the IP Address and instead display the domain name, which is a word legible to the user. If there weren’t domain names, every website would show in the address bar as a bunch of digits.
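To make the lookup concrete, here is a toy version in Python. It assumes the usual meaning of the two record types: an A record maps a name to an IP Address, and a CNAME record maps a name to another name. The records and the resolve function are invented for this example; real lookups are done by DNS servers, not by a dictionary in your program.

```python
# Made-up records for illustration; real DNS records live on name servers.
A_RECORDS = {"example.com": "192.0.2.10"}           # name -> IP address
CNAME_RECORDS = {"www.example.com": "example.com"}  # name -> another name

def resolve(name):
    """Follow CNAME records until an A record gives an IP address."""
    seen = set()
    while name in CNAME_RECORDS and name not in seen:
        seen.add(name)              # guards against records that loop
        name = CNAME_RECORDS[name]
    return A_RECORDS.get(name)      # None if the name is unknown

print(resolve("www.example.com"))  # prints 192.0.2.10
```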

Think of a domain like the filepath to a web app. The domain tells the browser to visit a server at which point the index file displays and you see the website. The way the internet developed originally was as a giant folder you could connect to from lots of different machines. That was version 1 of the internet. Over time more people started using and extending it until you have what you see today which has gotten quite complicated.

The catch is you have to buy this word from a domain company to then control where that domain name points. You can point the domain name to any server you want. Domains are a bit like toll booths that give people access to the ability to host things on the internet for a price. But they’re important because they make it easier for people to remember how to get places. Instead of remembering 66.220.159.255 you just remember Facebook.com.

 

When you finally get to programming it’s best to start with syntax. There is no point learning to program until you’ve learnt the syntax first because it forms the skeletal structure for your understanding of programming in a language. Syntax is just the rules of the language. You can’t learn how to write Chinese until you first learn what all the symbols mean and how to use them.

When using a programming language to build web applications, my huge realisations were about functions, the = ! " " ++ -- == operations, if else statements, loops, cookies, variables, comments, arrays, HTTP Methods and classes. The trick is to intertwine all of these by nesting each within another to create bulk processes and programs that do a lot without writing a lot of code.

The same way in math the building blocks are addition, subtraction, multiplication, division, decimals and the alphabet. Each concept is very simple and straightforward. But the moment you start interweaving and layering them together, very quickly you get really complicated stuff like Linear Algebra and Calculus.

Both are complicated but use the simple operations in ways that make your mind boggle. A similar thing happens for programming. These simple structures are layered onto each other to create complex programs. And everything you do on a computer is a program. The means by which you access the internet is via a web browser which is in itself a program. And these programs are pretty much all built using the above constructs. They are the lego blocks which programmers use to build complicated buildings.

 

Functions are the most important things ever. They’re like writing something big now and assigning it to a smaller thing that you can use any time in the future that you want. So instead of spending time over and over again writing the large thing, you just use the smaller one. So you could have some complicated equation, but then create a function for it. Every time you need to use that equation, instead of having to write it all out, you can use the function.
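For example, in Python (used here only as an illustration), the “big thing” could be the formula for the area of a circle. The function name and the rounded value of pi are choices made for this sketch.

```python
def area_of_circle(radius):
    """The 'big thing', written once: the area of a circle."""
    return 3.14159 * radius * radius

# The 'smaller thing' is the function name, reused whenever needed.
small = area_of_circle(1)
large = area_of_circle(10)
print(small, large)
```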

= and ! are probably the most important things. Equals in programming is not the same as equals in math. In math the equals sign means the resultant, as in 2 + 2 = 4 (2 + 2 results in 4), but in programming the equals sign means assignment. So you could say X = Y. And in your script whenever you reference X afterwards, it will hold the same thing as Y. The exclamation mark means Not. So when you want to check that something is not the same as something else, you use the !. X != Y asks whether X does not equal Y, and is true when they are different. Unlike =, it doesn’t change anything; it only checks. This distinction is hugely important.

" " are used to separate text from code, and the text inside them is called a string. It’s how the computer knows what is text and what is executable code. Whatever is in the quotation marks, the computer will treat as text. So if you want your program to say something to the user, you put this in quotation marks. So in PHP you could say echo "Hi, I'm Sam"; and the program will output Hi, I'm Sam to the user. In web apps, whenever a page is telling you something, in the script that text is contained in a string.

++ means increment, -- means decrement and == means equals in the comparison sense. A good way to think about ++ is just add 1 each time. Similarly -- is just subtract 1 each time. So if something has i++, it means i+1, then i+1+1, then i+1+1+1 etc. Or if it is i--, it means i-1, then i-1-1, then i-1-1-1 etc. == means equals in the sense that we already think about equals, the math version of 2 + 2 = 4. In programming you would write 2 + 2 == 4, which checks whether the two sides are equal and gives true. Whereas just a single = means assignment.
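Here are those operators side by side in Python, used only for illustration. One difference to be aware of: Python has no ++ or -- operators, so the sketch uses += 1 and -= 1, which do the same job.

```python
x = 5           # single = assigns: x now holds 5
y = x           # y holds the same value as x

print(x == y)      # == compares: prints True
print(x != 7)      # != means "not equal": prints True
print(2 + 2 == 4)  # prints True

i = 0
i += 1          # Python's spelling of i++ (add 1)
i += 1
i -= 1          # Python's spelling of i-- (subtract 1)
print(i)        # prints 1
```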

if and else statements are kind of like putting junctions in your program. They create forks in the road where your program can go one path or another path depending on what happens. An if statement is a condition with which the next thing happens. It means if X then do Y. And an else statement is an exception. It’s like saying otherwise. It kind of means if X, then do Y, else (otherwise) do Z. So you could say if X == Y, execute this code, else, execute that code. So if X does equal Y, the first code will run but if X does not equal Y the second code will run.
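A minimal fork in the road, sketched in Python for illustration. The check function and the two messages are made up.

```python
def check(x, y):
    if x == y:
        return "same"       # runs when the condition is true
    else:
        return "different"  # runs otherwise

print(check(3, 3))  # prints same
print(check(3, 4))  # prints different
```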

Loops are just super fast ways of doing the same thing over and over again while changing minor details. They are macro rules for conducting an operation N number of times. That operation can be anything contained within the loop. There are 3 major kinds of loops: for loops, while loops and do while loops. When you start a loop, you also need to write code to say what is going to happen while the loop is running.

A for loop means for each value of i, do something. So you could say for (i = 0; i < 100; i++), execute this code. What you have done is created a counter called i. You’ve said i starts at 0, increases by 1 after each pass, and you want your code to keep running while i is less than 100. So your for loop will then execute the code 100 times. The condition can be anything and is you telling the for loop how many times to run your code.
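The same loop can be sketched in Python, used here only for illustration. Python spells the counter differently: range(0, 100) produces the values 0 up to 99, which plays the role of starting at 0, stopping before 100 and adding 1 each time.

```python
runs = 0
for i in range(0, 100):   # i takes the values 0, 1, 2, ... 99
    runs += 1             # stand-in for "execute this code"
print(runs)  # prints 100
```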

A while loop is like a for loop with only the condition. It means while a condition is true, do something. The condition is checked before every pass, and the code runs only if it is true. For example you could say while (i > 5 && i < 10), execute this code, i++. If i starts at 6, your code will execute, then i will increase by 1, and this repeats 4 times, for i equal to 6, 7, 8 and 9. The core part of this is what won’t happen. If i is 5 or less when the loop is reached, the code will not execute at all, and once i reaches 10 the loop stops. It will only execute while i is between 5 and 10. It’s like parameters you can set.
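A sketch of a while loop in Python, for illustration only. It assumes i starts at 6 and records each value of i for which the body runs, so you can see exactly when the loop is active and when it stops.

```python
i = 6
values = []
while i > 5 and i < 10:   # checked before every pass
    values.append(i)      # stand-in for "execute this code"
    i += 1
print(values)  # prints [6, 7, 8, 9]
```

If i had started at 5 or below, the condition would be false on the first check and the body would never run.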

The do while loop means do X while i. This is basically the same as the while loop except your code will execute at least once first, and then it will continue to execute while the condition is true. Basically the only difference is your code runs first. Then the condition is checked. So you could say do X while (i == 1). So your code will run once, and then if i equals 1 it will keep running, but if i does not equal 1 it will stop. This is useful for scripts when you want to check things.
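Python, used here only for illustration, has no do while loop of its own, so this sketch imitates one with a loop that runs the body first and checks the condition afterwards. It assumes i starts at 0, so the condition is false and the body runs exactly once.

```python
i = 0
runs = 0
while True:
    runs += 1          # the body always runs at least once
    if not (i == 1):   # the condition is checked after the body
        break
print(runs)  # prints 1
```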

What is really easy to do with loops is to create an infinite loop, and these are bad. It happens when you don’t set the upper limit of a loop or the logic is wrong, so the loop just keeps running forever. Say you wrote do X, i++ while (i > 5). If i starts above 5 and only ever increases, the condition never becomes false, so your code will run forever and will hang your program. This is a common way programs freeze or crash: someone writes a bad loop.

A really useful way of using loops is by utilising what’s called a Boolean argument. Anything that is Boolean must be either true or false. It can’t be anything other than true or false. So you can say while something is true, run this code. Or while something is false, run this code. What gets amazingly complicated is when you use multiple loops inside the same bit of code. So you can have your while loop run for every time your for loop runs for every time your do while loop runs while something is true or false. This will do something that will seem like magic.

Cookies are the web equivalent of the stamp or wristband nightclub owners and concert organisers put on the wrists of their patrons. It’s the way they keep track of who is who and who has done what and who has paid and who hasn’t. Now imagine if the stamp or wristband was smart and updated itself with everything you did in the concert. That’s what a cookie is. Cookies are stored in the user’s browser. They are closely related to sessions, which keep the same kind of information on the server and usually use a cookie to recognise which visitor is which.
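Here is a small sketch of the wristband being handed out and checked, using the SimpleCookie class from Python’s standard library (Python is used only for illustration). The cookie name visitor_id and its value are made up.

```python
from http.cookies import SimpleCookie

# The server creates a cookie; "visitor_id" and its value are made up.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"
print(outgoing.output())  # the header line the server would send

# On the next request the browser sends the cookie back as text,
# and the server reads it to recognise the visitor.
incoming = SimpleCookie("visitor_id=abc123")
print(incoming["visitor_id"].value)
```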

Variables are like internal storage devices within a programming language so you don’t forget what you are working on. They will end up being the bane of your existence. Assigning things to a variable is like creating a mini checkpoint in a video game. You can use that variable in the program as shorthand for using whatever you had assigned to the variable. Using something you’ve already done for something new later on.

As a rule of thumb, everything that is important should get assigned to a variable. So you could have a paragraph of text or some code you’ve written before. Instead of needing to write out the paragraph or code again when you want to use it, you can assign it to a variable. So you just use the variable instead when you want to use it.

Code is not intuitive to read. Especially when there are different styles of programming. Object oriented, procedural, imperative, functional and others are all styles of programming that require different mindsets. This is why you use English comments within a program to explain what a section of code does. When the program runs, the comments are ignored by the computer; they are only present so that other programmers can read and understand your code quickly. A programmer who doesn’t comment their code well is almost universally hated.

Arrays are huge lists which contain information within a program. Whenever you deal with large amounts of data in a script, it should be contained within an array. The reason is because an array has a super fast way of looking up things because it creates a tiny index of all the data in the array. So you can quickly use the Nth item in the array without actually knowing what it is.
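In Python, used here only for illustration, the closest thing to an array is called a list. The product names are made up; the point is that you reach an item by its position.

```python
products = ["apple", "banana", "cherry", "date"]

print(products[0])    # first item: positions are counted from 0
print(products[2])    # third item
print(len(products))  # how many items the list holds
```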

The two most common HTTP methods are GET and POST, which are part of the protocol by which information is sent between the browser and the web app on the server. It’s like a tunnel through which data travels, to and from the user’s browser and the server. Without the tunnel the data can’t travel between these two points. GET is normally used to ask for something and POST to send data to be processed. Most of the time these HTTP methods are used for user input, the most common of which is via forms. GET and POST are also how one page of a web app passes information to another page, such as when a person adds products from a shopping website to a cart.
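The practical difference between the two can be sketched with Python’s standard urllib.parse module (Python is used only for illustration). The form fields and the shop.example address are made up, and no request is actually sent; the sketch only shows where the data would travel.

```python
from urllib.parse import urlencode, parse_qs

form_data = {"product": "book", "quantity": "2"}  # made-up form fields
encoded = urlencode(form_data)   # "product=book&quantity=2"

# GET: the data travels in the URL itself.
get_url = "https://shop.example/cart?" + encoded

# POST: the URL stays clean and the data travels in the request body.
post_url = "https://shop.example/cart"
post_body = encoded

# The server decodes either form back into fields.
print(get_url)
print(parse_qs(post_body))
```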

When an application wants to talk to an outside application, say when a shopping website wants to send data to a bank to charge a user’s credit card for an item, it also uses GET and POST requests. It does so through an API, which is an Application Programming Interface. An API is a way for a web app on a server to send data to and receive data from other web apps on other servers. So one web app will POST data to another web app’s API, or send a GET request to fetch data from it.

Classes are like big categories of things. Each individual thing in that category is called an object. I’ve always imagined it like walking into a public library where they tag different books by their subject like Fiction, Non-Fiction, Children’s etc. Every new book slots into one of the subjects. If there isn’t a subject for a type of book, then they create a new subject for it. That’s kind of how classes work. The subject where all the books are kept is the class and each book is an object within the class. I know they’re important but I don’t use them.
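A loose sketch of the library idea in Python, for illustration only. It rearranges the analogy slightly: the class here is simply Book, each individual book is an object made from it, and the subject is stored on each object. The class, its fields and the titles are all choices made for this example.

```python
class Book:
    """The category: what every book has in common."""
    def __init__(self, title, subject):
        self.title = title
        self.subject = subject

    def label(self):
        return self.title + " (" + self.subject + ")"

# Each individual book is an object made from the class.
first = Book("Dune", "Fiction")
second = Book("Cosmos", "Non-Fiction")
print(first.label())  # prints Dune (Fiction)
```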

 

I’ve kind of skipped over all the object oriented stuff. The main reason is because I don’t like OO programming, I think it’s not relevant most of the time. Designing a program in an object oriented fashion is preparing it for large numbers of users because all of the benefits of OO only happen at scale.

The major principle is everything that you will probably use over and over again is turned into a class and an object. Like the ability for a user to create an account will be a class and have the user as an object and then every new user will be a new user object. It’s useful for creating huge applications but requires knowledge that is prohibitively complex. Primarily it’s difficult to understand and get right.

Graduating to OO programming is like going from a bicycle to a motorbike. It’s way more powerful. The difference is if you crash on a bicycle there is no real harm done but if you crash on a motorbike you can create a huge wreck. But if the goal is to travel huge distances then it is easier to have a motorbike than a bicycle.

When creating a web application there are common design patterns that are used for the architecture of it. Most are a subset of a way of thinking about the program and are extensions of 3 methodologies. To rely heavily on files, functions or classes. Within each is a subset of their own design patterns and many use a combination of all 3.

1) Everything is a file. Code is written into lots of different files with each file containing individual functionality. 2) Everything is a function. All functionality is converted into a function and you have one master file containing a list of all of the functions. 3) Everything is a class. And you manipulate objects within classes to create the functionality. 2 and 3 require more thought before starting out.

Every time you enter data into a web app you are usually using an HTML form. Facebook is a site with thousands of tiny forms everywhere on the page that allow you to type and enter data. The form sends that data to the server, which then stores it in a database.

 

What is a database? It is a specialised storage space on a server which allows you to store data on disk. When you are working on something, that data is stored in memory. After you are finished it is stored on disk. So to begin working on something the computer takes it from disk and stores it in memory until you are done, at which point it stores it back on disk. These are both places where data is stored. Disk is where data is stored permanently whereas memory is where data is stored temporarily.

The difference between disk and memory is that disk is like the brain and memory is like the mind. When something is stored in your brain it’s there forever but when it’s in your mind it’s temporary until you think of something else. So you take what’s in disk and store it in memory to use it. The same way to use your thoughts you take what is stored in your brain and bring it to your mind.

There are different types of databases but one of the most popular is MySQL. Part of the reason is SQL, which is a special language used to manipulate the data stored in a database. SQL tells the database what to do with data and where to store it. Because you can’t see the data in a database directly, SQL is the way you interact with it, via queries.

The structure of a MySQL database is very easy. If you ever played Battleship growing up or used Microsoft Excel, it will seem very intuitive. It’s basically a table with columns and rows, like a chessboard. The catch is, unlike Battleship which is just letters and numbers, you can name the columns and rows whatever you want. Each table is like a separate chessboard that contains different columns and rows.

This means you can store any kind of information you want. Whenever you add data it takes up one of the squares in the chessboard. And you can refer to the square by referring to the name of the column and row. The same way in Battleship you can hit a square by choosing A12, which would then shoot the ship that is parked there.

You could for example set the column titles to be information about a person. And you could set the row titles to be the user number. So the rows will be user1, user2, user3 etc while the columns might be name, age, height. So under the user1 row and the height column would be the square containing the data that is the height of user1. Usually the first column of the table is something with an incrementing number and is what you use to keep track of all the information. Otherwise it would become unwieldy very quickly.
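Here is that table as a runnable sketch. To keep it self-contained it uses SQLite through Python’s built-in sqlite3 module instead of MySQL, so small details differ, but the SQL is very similar. The table name, the column names and the people in it are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # temporary database held in memory

# The first column is the incrementing number that identifies each row.
db.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, height INTEGER)"
)
db.execute("INSERT INTO users (name, age, height) VALUES ('Sam', 30, 180)")
db.execute("INSERT INTO users (name, age, height) VALUES ('Alex', 25, 165)")

# Look up one "square": the height column of the row whose id is 1.
row = db.execute("SELECT height FROM users WHERE id = 1").fetchone()
print(row[0])  # prints 180
```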

A database is also stored on a server. For small websites, this is usually the same server the website is served from. Bigger websites will have a separate server dedicated to it. One server will run the website, another server will run the database, and they will just add more servers to either end as the website becomes bigger and bigger.

When you create an account or store any information on a website, it is stored in a database just like this. Most websites these days are in fact just pretty interfaces to a database. The data is stored here and the website displays it to you in an intuitive way. Databases by themselves are very ugly. It is just a table full of data.

Inside a database you can have many tables. You can have as many tables as you want, and tables can refer to each other: an orders table, for example, can point to rows in a users table. Each of those tables has its own rows and columns. So the amount of information you can store in a MySQL database is practically unlimited. In practice it is limited by how much storage you have on disk.

Usually inside the programming code behind a web page is an SQL query telling the database where and how to store the data. So the code that handles an HTML form will also contain some SQL telling the database where to store the submitted information. This can’t be seen on the front end, only the back end.

SQL is also one of the main ways attackers break into websites. They type SQL into a form field, and if the form is not protected properly, the database runs it and they can take all of the information. I could write a query that deletes all the customers from your database, ruining all your hard work getting those customers. The real protection is to never paste what a user typed straight into a query, and to use parameterised queries instead. Some people also keep the names of their columns and tables hard to guess but still easy for themselves to understand, although that only slows an attacker down.
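
To make the attack concrete, here is a small sketch of an unprotected lookup next to a protected one. It assumes Python’s built-in SQLite in place of MySQL, and the `customers` table and both function names are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Alice",), ("Bob",)])

def find_customer_unsafe(conn, name):
    # DANGEROUS: whatever was typed into the form becomes part of the SQL itself.
    query = "SELECT name FROM customers WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_customer_safe(conn, name):
    # The ? placeholder keeps the typed text as plain data, never as SQL.
    return conn.execute(
        "SELECT name FROM customers WHERE name = ?", (name,)
    ).fetchall()

# Text an attacker might type into a "name" field.
malicious_input = "nobody' OR '1'='1"

leaked = find_customer_unsafe(conn, malicious_input)   # matches every customer
blocked = find_customer_safe(conn, malicious_input)    # matches no one
```

The unsafe version ends up running `WHERE name = 'nobody' OR '1'='1'`, which is true for every row, so the attacker gets the whole table back. The safe version just looks for a customer with that odd name and finds nothing.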

The problem with SQL, and also the thing that makes it slow, is what happens when you retrieve information from a MySQL database. Unless there is an index, the query literally goes through all of the rows one by one until it finds the right one. So if you had 10 columns, representing the types of information you collect, and 100,000 rows, representing the users in your database, the query would go through every row until it finds the one containing the data it is looking for.

I’ve always imagined it like throwing rocks over water. The rock skips along the water until it finally lands. SQL is similar. When you select data from a table, the SQL query will go through all of the data in the table individually until it finds the right one. It’s why as the amount of data increases, the performance becomes worse because the SQL queries have to sift through more information to get there.

Thus the smaller the data in the table, the faster it can be retrieved. You can also speed this up by storing less information, by spreading it across more tables, or by adding more databases or database servers. The slowest part of a web app is usually connecting to the database and waiting for the SQL query to move data around. Because most of your data is stored in a database, the easiest way to speed up a slow web app is to reduce the number of database calls and make each one cheaper. Two of the easiest ways of doing this are creating a cache, which cuts down the number of calls, and creating an index, which makes each call faster.

A cache is like having a complete copy of something that is just closer to you, so instead of making the effort to go get the real thing, you can use the copy which is closer. An index is like having a huge list of something prepared so you don’t have to search for it, you just find it in the index. If you had a book and you wanted to find one particular word, you could scan each individual page for it or you could create an index of all the words and refer to that instead which is faster.
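
Here is a minimal sketch of both ideas, again assuming Python’s built-in SQLite rather than MySQL. The table, the index name `idx_users_name` and the helper functions are all hypothetical. The first half asks the database how it plans to run a lookup before and after an index exists, and the second half puts a simple cache in front of the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, height_cm INTEGER)")
conn.executemany(
    "INSERT INTO users (name, height_cm) VALUES (?, ?)",
    [(f"user{i}", 150 + i % 50) for i in range(1000)],
)

def query_plan(conn):
    # Ask SQLite how it would run the lookup; the last column of the
    # first result row is a human-readable description of the plan.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT height_cm FROM users WHERE name = ?",
        ("user500",),
    ).fetchall()
    return rows[0][-1]

plan_without_index = query_plan(conn)  # describes a scan of the whole table
conn.execute("CREATE INDEX idx_users_name ON users (name)")
plan_with_index = query_plan(conn)     # describes a search using the index

# A cache: remember answers so repeated lookups never reach the database.
stats = {"db_calls": 0}
cache = {}

def get_height(name):
    if name not in cache:
        stats["db_calls"] += 1
        row = conn.execute(
            "SELECT height_cm FROM users WHERE name = ?", (name,)
        ).fetchone()
        cache[name] = row[0] if row else None
    return cache[name]

first = get_height("user500")   # goes to the database
second = get_height("user500")  # answered from the cache
```

After both calls, `stats["db_calls"]` is still 1, because the second lookup was served from the copy that was closer. A real cache also has to decide when its copy is out of date, which this sketch ignores.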

Most applications have a front end and a back end. The front end is the part of the website you see and interact with and includes the HTML, CSS and JavaScript (including AJAX). The back end is the part of the website you can’t see and is the functionality. It includes the programming code, server and database that make up the web app and its behaviour. The back end is arguably the more difficult, as evidenced by how many more technologies it involves. There are hundreds of different programming languages and server and database configurations but only one core HTML.

A common question is how long it takes to build something. It depends on what it does, but a good rule of thumb is that everything takes at least triple the time you think it will. This is because you won’t even know about most of the problems you’ll face, let alone how to fix them. And in the process of building something, you’ll think of new things. In fact I’d say most of the good ideas come after you’ve already started building something.

And also most of this functionality has already been written somewhere. You almost never need to write all of this from scratch. Someone else has done the work and created a library. So just use the library.

 

Now let’s put this all together in a hypothetical example. Let’s take something everyone is familiar with: an online shopping cart.

“A user is viewing a handbag on an online shopping website. They add the handbag to cart and then checkout and pay for it using a 10% off handbags discount voucher if buying 2 or more handbags.”

Here’s how it works from a programming perspective.

What happens is the user is on the index page. The image of the handbag, the price and the description will be stored in different columns of the database. So to display it, the page is probably making 4 MySQL database calls to get this information. The index page displays the information from the database by inserting it into the HTML (.html files), which is styled by the CSS (.css files) and made interactive by the JavaScript (.js files), to make the page look pretty so the user is more likely to buy something.

This grabbing of information from the MySQL database and showing it on the page will be put into a function, so the same functionality doesn’t need to be re-written over and over again on every product page. A function is written once, and every page will now behave in exactly the same way.
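
A sketch of what such a function might look like, assuming Python with its built-in SQLite instead of PHP with MySQL. The `products` table, its columns and the two function names are all invented for this example.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, "
    "price_cents INTEGER, description TEXT, image_url TEXT)"
)
conn.execute(
    "INSERT INTO products (name, price_cents, description, image_url) "
    "VALUES (?, ?, ?, ?)",
    ("Leather handbag", 12000, "A brown leather handbag.", "/images/handbag.jpg"),
)

def get_product(conn, product_id):
    """Fetch everything a product page needs, or None if there is no such product."""
    row = conn.execute(
        "SELECT name, price_cents, description, image_url "
        "FROM products WHERE id = ?",
        (product_id,),
    ).fetchone()
    return dict(row) if row is not None else None

def render_product_page(product):
    """Turn a product into a very bare piece of HTML."""
    name = escape(product["name"])
    description = escape(product["description"])
    image_url = escape(product["image_url"])
    price = product["price_cents"] / 100
    return (
        f"<h1>{name}</h1>"
        f"<img src='{image_url}'>"
        f"<p>{description}</p>"
        f"<p>${price:.2f}</p>"
    )

# Written once, used for every product: only the id changes.
handbag_page = render_product_page(get_product(conn, 1))
```

This version fetches the name, price, description and image in a single query rather than four separate ones, which is one simple way of cutting down database calls.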

The header, sidebar and footer of the page are modular, so they are separate files. Maybe header.php, footer.php, sidebar.php and product.php. When you view the handbag, you are probably on product.php, which is using the include function to pull in the code from the other files. This keeps the page looking consistent with every other page.

If this is object oriented programming, which is usually the case for big shopping websites, the actual handbag will be an object of a product class that represents all the products. All the products are individual objects of the one class. So when different handbags are displayed, the code creates a new object from the product class for each one, filling it with different information from the database each time.
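
In code, the idea might look something like this. It is a sketch in Python rather than PHP, and the `Product` class, its fields and the sample rows are all hypothetical stand-ins for what would come out of the database.

```python
from dataclasses import dataclass

@dataclass
class Product:
    """One class that describes what every product has in common."""
    name: str
    price_cents: int
    description: str

    def price_label(self):
        # Format the stored price (in cents) the way a shopper expects to see it.
        return f"${self.price_cents / 100:.2f}"

# Stand-ins for rows that would normally come out of the database.
rows = [
    {"name": "Leather handbag", "price_cents": 12000, "description": "Brown leather."},
    {"name": "Canvas handbag", "price_cents": 4500, "description": "Lightweight canvas."},
]

# One class, many objects: each row becomes its own Product.
products = [Product(**row) for row in rows]
labels = [product.price_label() for product in products]
```

Every handbag gets the same behaviour, such as `price_label`, for free, and only the data inside each object differs.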

Now when you hit add to cart, an IF statement checks whether the button was pressed. If it was, a POST request is sent from the file product.php to another file called cart.php, which reads the data sent along with the request. If it is successful, the handbag will have been added to the cart and the user will now be on a new page, cart.php. On cart.php there will also be a form for the user’s shipping information and details. When they fill in the details, cart.php will store their information and information about this order in the database.
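
Here is a rough sketch of what the receiving side of that POST request does, written as a plain Python function rather than a real cart.php. The function name, the field names `product_id` and `quantity`, and the use of a dictionary as the cart are all assumptions made for illustration.

```python
from urllib.parse import parse_qs

def handle_cart_request(method, body, cart):
    """Handle one request to the cart page and return an HTTP status code."""
    # IF the request is not a POST, nothing was submitted to add.
    if method != "POST":
        return 405  # Method Not Allowed

    # A submitted form arrives as text like "product_id=handbag-1&quantity=2".
    form = parse_qs(body)
    product_id = form.get("product_id", [None])[0]
    if product_id is None:
        return 400  # Bad Request: the form did not say which product

    try:
        quantity = int(form.get("quantity", ["1"])[0])
    except ValueError:
        return 400  # Bad Request: quantity was not a number

    # Add to whatever is already in the cart for this product.
    cart[product_id] = cart.get(product_id, 0) + quantity
    return 200

cart = {}
status = handle_cart_request("POST", "product_id=handbag-1&quantity=2", cart)
```

After this runs, `cart` holds two of `handbag-1`. A real site would also keep the cart tied to the user’s session or store it in the database.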

Now in the cart you will be able to change how many handbags you want to buy and enter a discount code. This is where the loops come in and it gets complicated, because they will be nested within each other: an IF statement, if there is a handbag in the cart; a FOR loop, for each handbag in the cart; a WHILE loop, while the handbag is in the cart; another IF statement, if there are 2 or more handbags, apply the discount; and an ELSE statement, if there are fewer than 2 handbags, do not apply the discount.
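
The discount rule itself can be sketched in a few lines. This is one possible way to write it in Python, not how any particular shop does it, and the function name, the constants and the shape of the cart items are all made up for the example. Prices are kept in cents so the arithmetic stays exact.

```python
HANDBAG_DISCOUNT_PERCENT = 10   # the "10% off handbags" voucher
MIN_HANDBAGS_FOR_DISCOUNT = 2   # "if buying 2 or more handbags"

def cart_total_cents(cart_items, has_voucher):
    """cart_items is a list of (category, price_cents, quantity) tuples."""
    handbag_count = 0
    handbag_subtotal = 0
    other_subtotal = 0

    # FOR each item in the cart, add it to the right running total.
    for category, price_cents, quantity in cart_items:
        if category == "handbag":
            handbag_count += quantity
            handbag_subtotal += price_cents * quantity
        else:
            other_subtotal += price_cents * quantity

    # IF there is a voucher and enough handbags, discount the handbags only.
    # Otherwise the handbag subtotal is left alone.
    if has_voucher and handbag_count >= MIN_HANDBAGS_FOR_DISCOUNT:
        handbag_subtotal = handbag_subtotal * (100 - HANDBAG_DISCOUNT_PERCENT) // 100

    return handbag_subtotal + other_subtotal

two_bags_with_voucher = cart_total_cents([("handbag", 10000, 2)], True)
one_bag_with_voucher = cart_total_cents([("handbag", 10000, 1)], True)
```

Two $100 handbags with the voucher come to $180, while a single handbag with the same voucher stays at $100 because the rule needs at least two.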

Now when the user presses a checkout button, the current file cart.php will send another POST request to another file called checkout.php, which reads the information sent with it. The checkout.php page will be served over an encrypted connection (HTTPS) so the user’s information is protected, and it is where the person enters their credit card information.

Once they hit Buy Now and pay for the order, the file checkout.php will send a POST request to the user’s bank (in practice usually through a payment provider), which will have an API that receives the payment information. If the information is received successfully, the bank charges the credit card and pays the handbag store.

Details of the sale will be stored with the handbag shop. They will look up the details and then ship the handbag to the customer, who receives the handbag they just bought.

And so you very quickly see how web applications with thousands of products and lots of users can have thousands of files and tens of thousands of lines of code behind functionality that seems really simple to the user, who just wants to buy a handbag.

 

We programmers take all of this for granted, but most of these ideas were revelations in the minds of their inventors. They were huge technological breakthroughs and were cutting edge in the 90s. My favourite example of this is copy, cut and paste. It seems so obvious to us now, but it was a huge innovation when it was introduced to computers. It came from the literary and writing world, where manuscripts were edited by literally cutting and pasting paper. While expensive at first, today it is so ubiquitous that we forget it was once innovative.

All new technology starts out as something only the rich can afford to use. But Moore’s law both increases the speed of technology and reduces its cost. So very quickly, new technology becomes affordable for everyone. Stopping something from being built or opposing it because it can’t be used by 99% of people is just being shortsighted. Eventually it gets cheap enough for everyone to use. The internet and all web applications started this way.

The internet is just like a humongous library. It is in fact the perfect library. All the internet does is store information and then allow people access to that information. It took a leap that libraries had not taken before: allowing ordinary people to add information to this huge collective.

In the Great Library of Alexandria, around 300 BC, scholars were encouraged to write scrolls to be included in the library, thus advancing collective human knowledge. The internet is a faster, more networked version of the same thing. It’s a huge library. We take it for granted, but imagine how smart people would be if they were constantly connected to all surviving human knowledge that has ever existed. That is exactly what we have.