· 4 min read
at15

A friend who works at a bank wanted to learn Python, and I (using Java at work) wanted to pick Python up again for machine learning. I thought I could learn Python better by teaching another person. I figured this might also help people in a similar situation (e.g. teaching a partner who wants to try a software-related job), so I decided to write a series of blog posts documenting the 'course' I designed and the material used.

Syllabus

There are many Python tutorials, books, and open courses online, but I want to design a course that relates more closely to my friend's background (working at a bank), so he can practice with what he has learned, find the knowledge useful, and get into a positive feedback loop. Otherwise, I would be forcing him to memorize concepts and telling him they will be useful eventually, like my calculus teacher at university.

He learned programming at university a decade ago and now only remembers HTML... So the plan is to start from the very basics and cover the daily tooling of a software engineer, such as using the terminal, git/GitHub, and writing Markdown. Programming-related tooling is a crucial part that I found missing in intro courses back at university. Version tracking and writing documentation are important not only for collaborating with other people at work but also for side projects with only one author (and one user...), because you may pick the project up a year later and need to figure out what you were doing ASAP before you give up again.

The syllabus is divided into chapters, though the later chapters contain more content. The schedule is one 3-hour in-person session per week, and there is no fixed plan for when we will finish. I will adjust the content based on our progress.

Prepare development environment

  • Terminal
    • common shell commands, ls, cd, cat, vi
    • accept input from and write output to the command line
    • make the calculator an executable you can run like other programs
  • Git and GitHub
    • Create a GitHub account
    • Track code using git from GUI/terminal
    • Use GitHub Codespaces and VS Code for both local and remote development
  • Use the REPL, run Python files, and use Jupyter Notebook
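
As a rough sketch of where the terminal items above could lead, a calculator that takes its input from the command line might look like the following. The file name calc.py, the function names, and the LEFT OP RIGHT argument form are assumptions made for illustration, not the actual course material.

```python
#!/usr/bin/env python3
"""Tiny command-line calculator (hypothetical calc.py)."""
import sys


def calculate(left, op, right):
    """Apply one of + - * / to two numbers."""
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    if op == "*":
        return left * right
    if op == "/":
        return left / right
    raise ValueError(f"unknown operator: {op}")


def main(argv):
    """Read LEFT OP RIGHT from argv and print the result or an error."""
    if len(argv) != 3:
        print("usage: calc.py LEFT OP RIGHT", file=sys.stderr)
        return
    try:
        result = calculate(float(argv[0]), argv[1], float(argv[2]))
    except (ValueError, ZeroDivisionError) as err:
        # Covers non-numeric input, unknown operators, and division by zero.
        print(f"error: {err}", file=sys.stderr)
        return
    print(result)


if __name__ == "__main__":
    main(sys.argv[1:])
```

On Linux or macOS, chmod +x calc.py together with the first line lets you run it as ./calc.py 3 + 4 like any other program. The multiplication sign needs quoting (./calc.py 3 '*' 4) so the shell does not expand it into file names.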

Language itself

  • Hello world using a tax/mortgage calculator
    • variable
    • control flow, if, for
    • function, argument, return, scope of variables inside function
  • Basic data structure for tracking a person's monthly expense
    • list
    • dict
    • draw a plot; if it works without Jupyter Notebook, we can stick with the CLI
  • Classes: model the behavior of a company, e.g. generic employee, manager, worker, CEO, etc.
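
To make the first two items concrete, here is a minimal sketch that touches variables, control flow, functions, lists, and dicts. It assumes the mortgage calculator uses the standard fixed-rate amortization formula; the function names and the sample expenses are made up for illustration.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment for a fully amortizing loan.

    Assumes the standard formula P * r / (1 - (1 + r) ** -n),
    where r is the monthly rate and n the number of monthly payments.
    """
    months = years * 12
    monthly_rate = annual_rate / 12
    if monthly_rate == 0:
        # Without interest the principal is simply split evenly.
        return principal / months
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)


def total_by_category(expenses):
    """Sum a list of expense dicts into a dict keyed by category."""
    totals = {}
    for item in expenses:
        category = item["category"]
        totals[category] = totals.get(category, 0.0) + item["amount"]
    return totals


# Hypothetical monthly expenses: a list of dicts.
expenses = [
    {"category": "rent", "amount": 1500.0},
    {"category": "food", "amount": 320.5},
    {"category": "food", "amount": 79.5},
]

print(round(monthly_payment(100_000, 0.06, 30), 2))
print(total_by_category(expenses))
```

Keeping the calculation in a function that returns a number, rather than printing inside it, makes it easy to reuse later from a test or a web service.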

Algorithms

  • brute force
  • complexity, O(N)
  • recursion
  • binary search (for divide and conquer)
  • dfs/bfs on graph/tree (we can use dict and switch to class later)
  • LeetCode
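
Two of the items above fit in a few lines each. This is only a sketch: the graph is stored as a plain dict of adjacency lists, as the list suggests doing before switching to classes, and the function names and sample data are illustrative assumptions.

```python
from collections import deque


def binary_search(sorted_items, target):
    """Return the index of target in a sorted list, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1


def bfs(graph, start):
    """Return nodes in breadth-first order; graph maps node -> neighbors."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical graph used only for this example.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(binary_search([1, 3, 5, 7, 9], 7))
print(bfs(graph, "a"))
```

Binary search halves the remaining range on every step, so it needs about log2(N) comparisons, compared with up to N for a brute-force scan.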

Maintainability

  • Test, unit test
  • Package, dependency and version management
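
For the unit test item, a minimal sketch with the standard library's unittest module could look like this. The apply_tax function and the module name test_tax are hypothetical stand-ins for whatever the tax calculator ends up being.

```python
import unittest


def apply_tax(amount, rate):
    """Return amount plus tax, rounded to cents; rate is a fraction like 0.08."""
    if rate < 0:
        raise ValueError("rate must not be negative")
    return round(amount * (1 + rate), 2)


class ApplyTaxTest(unittest.TestCase):
    def test_adds_tax(self):
        self.assertAlmostEqual(apply_tax(100, 0.08), 108.0)

    def test_zero_rate_changes_nothing(self):
        self.assertEqual(apply_tax(19.99, 0), 19.99)

    def test_negative_rate_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_tax(100, -0.1)
```

Saved as test_tax.py, the tests can be run from the terminal with python -m unittest test_tax.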

Application

  • Web
    • Crawl other websites
    • A simple web service
      • Turn the earlier mortgage calculator into a website
      • use other packages
    • Use a database; start with SQLite, which needs no separate server, because Docker is only introduced later
      • Start by keeping everything in memory and see what issues that causes (the data is gone after a restart)
  • Cloud
    • Docker for development and packaging
    • Deploy the website to GCP/AWS using the free tier
  • ML: predict a person's spending, stock prices, etc.
    • Jupyter Notebook
    • numpy, pandas
    • scikit-learn, linear regression etc.
    • PyTorch
    • LLM
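
The "gone after restart" point from the database item can be demonstrated with Python's built-in sqlite3 module. This is a sketch under assumptions: the quotes table, its columns, and the file name are invented for the example, and closing and reopening the connection stands in for restarting the web service.

```python
import os
import sqlite3
import tempfile

SCHEMA = "CREATE TABLE IF NOT EXISTS quotes (principal REAL, payment REAL)"


def save_and_reload(db_path):
    """Insert one row, 'restart', and return how many rows survived."""
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.execute("INSERT INTO quotes VALUES (?, ?)", (100000.0, 599.55))
    conn.commit()
    conn.close()  # stands in for the service shutting down

    conn = sqlite3.connect(db_path)  # the service starts again
    conn.execute(SCHEMA)
    count = conn.execute("SELECT COUNT(*) FROM quotes").fetchone()[0]
    conn.close()
    return count


# ":memory:" gives every connection a fresh, empty database.
print("in memory:", save_and_reload(":memory:"))

# A file-backed database keeps the row across connections.
with tempfile.TemporaryDirectory() as tmp:
    print("on disk:", save_and_reload(os.path.join(tmp, "calc.db")))
```

The only difference between the two calls is the path passed to sqlite3.connect, which makes it easy to move from the in-memory experiment to a persistent file.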

· 3 min read
at15

Why I decided to set up a blog using Docusaurus and deploy it to Cloudflare Pages instead of using Gatsby, Netlify, or GitHub Pages.

Why use a static site generator for a blog when you have Medium and WordPress?

I have used several hosted blog services such as Medium, WordPress, and Ghost. These managed platforms require no setup and provide better exposure thanks to their existing audiences. The main reason I chose a static site generator is that it lets me write the blog as code in plain Markdown, version control the content with git, and add styling or interactive components (in React) when needed. The managed platforms offer many things out of the box, but their customization and APIs are limited, e.g. the Medium API is no longer supported.

Why Docusaurus instead of Gatsby, Hugo, Jekyll, etc.?

GitHub Pages offers Jekyll, but I am not a fan of Ruby (working at AWS on region build made the feeling even worse). Building Jekyll pages locally is much more painful than with Go or Node.js based static site generators, because I don't set up a Ruby toolchain on my own devices.

Hugo is written in Go, but its template syntax and extensibility are not as good as those of React-based generators such as Gatsby and Docusaurus.

Gatsby is the one I planned to start with because it offers a GraphQL API, which makes building extensions and interacting with other languages a breeze. However, after looking at its blog template, I found several things I would need to implement myself, such as tags. I am lazy and want most things out of the box with reasonable defaults that I can customize later.

Then I looked into documentation-oriented static site generators such as Docusaurus and VuePress. I have used VuePress for awesome-time-series-database, but the VuePress blog plugin was last updated 9 months ago, so I tried Docusaurus. The default Docusaurus template supports both docs and a blog with tags, and the blog looks better than Gatsby's blog template. So I decided to use Docusaurus.

Why Cloudflare Pages instead of Netlify, GitHub Pages?

GitHub Pages is the most straightforward to set up when the blog's source code is hosted on GitHub, and I used it before switching to Netlify for PR previews, which GitHub Pages does not support. However, Netlify's free plan is limited to 100GB of bandwidth and then costs $55 per 100GB, while Cloudflare Pages has no bandwidth limit.

Cloudflare Pages does have its own peculiar hard limit of 20,000 files, due to a technical limitation. But as long as I don't copy node_modules into the build directory, it might take years for me to hit that limit with notes and blog posts (116 files right now). By the way, you can count the files (including those in subdirectories) using find . -type f | wc -l, per Stack Overflow. If I did hit the limit, I could host (part of) the website elsewhere (e.g. nginx, or an object store like R2/S3) and use the (Cloudflare) CDN to serve the content.
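
If you would rather do the count from Python, for example as a check before deploying, a rough equivalent of the find command could be sketched as follows. The function names and the idea of a pre-deploy check are my own assumptions; the 20,000 figure is the limit mentioned above. Unlike find . -type f, os.walk also lists symbolic links that point to files, so the two counts can differ slightly.

```python
import os

FILE_LIMIT = 20_000  # Cloudflare Pages file limit quoted in this post


def count_files(root):
    """Count files under root, including those in subdirectories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def under_limit(root, limit=FILE_LIMIT):
    """Return True if the build directory stays within the file limit."""
    return count_files(root) <= limit


if __name__ == "__main__":
    print(count_files("."), "files, within limit:", under_limit("."))
```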

In the next post I will talk about the actual steps of creating the blog with Docusaurus and configuring Cloudflare Pages with GitHub integration and a custom domain.