Golang Notes

In this post, we’ll start building foundational knowledge.

In computer science, and particularly programming, it’s important to get acquainted with the fundamentals before you do anything. Today, we’re going to build up the knowledge we don’t have, and start identifying and defining our “foundational concepts.” These are the things that we absolutely need to understand in order to do any meaningful work in this space.

The concepts

A great deal of inspiration has been drawn from this wiki, but I won’t address the concepts in the order in which they were presented there; instead, I’ll go in the order I think I need to understand them. I’ve included links to the various places on the internet that contain definitions and other helpful information, as well as my personal understanding of each concept.

high level programming

  • code compilation - wikipedia

    • in general, code compilation is any process of turning one programming language into another. it’s mostly used to turn high level languages like golang or c into a low level language like asm or object code to create an executable.
  • compiler optimization - wikipedia

    • this is the practice of having the compiler tune the code it produces for any of the following attributes: execution time, memory use, output code size & power consumption. the usual tradeoff is a longer compile time.
  • assembly code - wikipedia

    • assembly is a human-readable notation for the machine code that processors actually execute. compilers mostly work by turning source code (source) into machine code (target), often by way of an assembly language. it’s possible, but very annoying, to write assembly code directly.

computer processors

  • inter process communication - (github) (wikipedia)

    • a set of methods one can use to share data between processes. by default, processes share no memory & cannot directly communicate with each other. IPC enables that.
  • semaphore - (github) (wikipedia)

    • a data type that’s used to control access to a shared resource across several concurrent processes or threads. unlike UNIX pipes, which move data between processes, a semaphore only coordinates who may proceed and when. in the general sense, they are required for certain kinds of complex work between multiple processes.
  • instruction cache (icache) - (github) (wikipedia)

    • a cache is somewhere you store things so that you can access them faster. an icache is a cache for a set of instructions to be run by the CPU. instructions in this context are the machine code operations that source code is compiled into, not the lines of source code themselves.
  • registers - (github) (wikipedia)

    • registers are tiny storage locations inside the CPU itself. they are at the top of the pyramid in the memory hierarchy, which simply means that the CPU can access them the fastest. they are minute and their size is typically measured in bits. compared to RAM & hard drives, which have at least (~8GB - 10^9 larger) & (~1TB - 10^12 larger) of memory available to them respectively, a single register holds only (~32 bits, or 64 on most modern machines).
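
to make the semaphore idea concrete, here is a minimal sketch in go. it uses a buffered channel as a counting semaphore, which is a common go idiom rather than how an operating system implements one, and it coordinates goroutines inside a single process instead of separate processes. the names `runLimited`, `tasks` and `limit` are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited (hypothetical helper) starts `tasks` goroutines but lets at most
// `limit` of them be inside the guarded section at the same time. It returns
// the highest number of goroutines that were ever inside the section together.
func runLimited(tasks, limit int) int {
	sem := make(chan struct{}, limit) // counting semaphore with `limit` slots
	var (
		mu      sync.Mutex
		active  int
		maxSeen int
		wg      sync.WaitGroup
	)
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire: blocks while all slots are taken
			defer func() { <-sem }() // release the slot on the way out

			// record how many goroutines currently hold a slot
			mu.Lock()
			active++
			if active > maxSeen {
				maxSeen = active
			}
			mu.Unlock()

			mu.Lock()
			active--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return maxSeen
}

func main() {
	// the exact number varies from run to run, but it never exceeds the limit
	fmt.Println("max concurrent workers:", runLimited(20, 3))
}
```

the point of the sketch is only the acquire/release pairing: a goroutine has to take a slot before entering and must give it back afterwards.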

compilation concepts

  • bounds checks - (github 1, 2) (wikipedia)

    • a check that a program performs (usually inserted by the compiler or the language runtime, not invented by the CPU) to verify that a variable meets some conditions before it is used. one common type of bounds checking is range checking, where a variable is checked to see if it fits into a given type, like a uint8 (unsigned 8 bit integer) or int16 (signed 16 bit integer). another common type is index checking, where an index into an array is checked to see if it actually falls within that array.
  • compiled function nodes / abstract syntax trees (ASTs) - (github) (wikipedia)

    • ASTs are an abstract representation of source code, generally created by some kind of parser or compiler. they represent certain elements as nodes in the tree, such as functions, if-statements & comparisons. the term compiled function node can refer to either the literal individual node in an AST representing the function, or the individual function node and all of its children in the tree.
  • inlining functions - (github) (wikipedia)

    • this concept relates to compiler optimization. parent functions call child functions - the child function can either be represented as a separate function “node” or it can be inlined into the parent function in a manner functionally equivalent to copy pasting. the dynamics of when something should vs should not be inlined are a matter of long and frequent discussion with golang, in particular this issue.
  • intrinsic functions - (github) (wikipedia)

    • these are functions that the compiler handles specially: instead of compiling a normal call, it substitutes a hand-tuned sequence that leverages more low level processor resources. they’re sometimes written directly in assembly, and are often architecture specific (e.g. amd64 or arm64 processors).
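
the two kinds of bounds checks described above (range checking and index checking) can be sketched by hand in go. this is only an illustration written for these notes: `fitsUint8` and `at` are hypothetical helpers, not functions from the go standard library or the compiler.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// fitsUint8 is a range check: does v fit in an unsigned 8-bit integer?
func fitsUint8(v int) bool {
	return v >= 0 && v <= math.MaxUint8
}

// at is an explicit index check: it returns an error instead of letting an
// out-of-range index reach the slice access.
func at(xs []int, i int) (int, error) {
	if i < 0 || i >= len(xs) {
		return 0, errors.New("index out of range")
	}
	return xs[i], nil
}

func main() {
	fmt.Println(fitsUint8(200), fitsUint8(300)) // true false

	xs := []int{10, 20, 30}
	v, err := at(xs, 1)
	fmt.Println(v, err)

	_, err = at(xs, 3) // one past the end
	fmt.Println(err)
}
```

in real go code you rarely write the index check yourself: the language checks every slice and array access and panics when the index is out of range, and the compiler can drop a check when it can prove the index is always valid.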

compilation concepts with a direct impact on golang

  • static single assignment (SSA) - (github 1, 2, 3) (godoc) (wikipedia)

    • SSA is a form of intermediate representation used inside compilers wherein each variable is assigned once and only once. in SSA, if a new value is assigned to a variable then a new version of that variable is created. SSA is primarily used for compiler optimization, since the guarantee of “a variable’s value will never change” can enable a variety of compiler optimization algorithms.
  • binary files / executables - (github 1, 2, 3) (wikipedia)

    • an executable is a file that can be run by a computer to do some task. they contrast sharply with data files, which do not contain instructions but information. many executables are binary files meant to be run by a machine directly, although the term executable can also be applied to the source code of scripting languages like python, which is run by an interpreter, unlike golang, which is compiled ahead of time.
  • *.o object files - (github 1, 2) (googlesource) (wikipedia 1, 2)

    • object files are compiler outputs, and their contents are usually machine code in a binary format. object files usually contain symbol and relocation information that a linker can use to fill in code from other places.
    • go’s compiler produces object files as one of the compiler outputs.
  • *.a files - (github 1, 2, 3) (stackoverflow)

    • Go’s *.a files are package archive files that were originally in the ar archive format. there was at some point a proposal to change them to the more standard zip file format. Archive files contain compiled package code, and additionally some debugging information.
  • file linking / static vs dynamic linking - (github 1, 2, 3) (wikipedia) (reddit) (stackoverflow 1, 2)

    • linking is the process by which an object file’s references to other code are resolved and the pieces are combined into a final program. dynamic linking is a linking process where the linking is not truly resolved until runtime, creating a dependency on external files like shared libraries (DLLs on windows). static linking is a linking process where any links are resolved at build time, creating a fully self-contained executable.
    • static linking is definitely a better choice 99% of the time.
    • go’s toolchain uses static linking by default, though not exclusively: programs that use cgo can end up dynamically linked against system C libraries.
  • export data - (github 1, 2, 3) (godoc 1, 2, 3)

    • simply means “the data that is exported from a package”
    • usage within golang is specific because of the standard library tools (linked above) in golang built around managing export data. those tools are responsible for the format and content of the export data.
    • that said, the term itself is fairly generic since “exported data” can mean an entire universe of things.
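
the “assigned once and only once” rule from the SSA entry above is easier to see side by side. the sketch below is hand-written go, not the go compiler’s actual SSA representation (which also has constructs for merging values at branches that are not shown here); `ordinary` and `ssaStyle` are names made up for the example.

```go
package main

import "fmt"

// ordinary reassigns the same variable x twice.
func ordinary(a, b int) int {
	x := a + b
	x = x * 2
	x = x - a
	return x
}

// ssaStyle computes the same result, but every name is assigned exactly once:
// each reassignment of x above becomes a new "version" (x1, x2, x3).
func ssaStyle(a, b int) int {
	x1 := a + b
	x2 := x1 * 2
	x3 := x2 - a
	return x3
}

func main() {
	fmt.Println(ordinary(3, 4), ssaStyle(3, 4)) // 11 11
}
```

because `x1`, `x2` and `x3` never change after they are defined, an optimizer can reason about each of them in isolation. as far as i know, setting the `GOSSAFUNC` environment variable to a function name while building makes the go compiler write its real SSA passes for that function to an `ssa.html` file, which is a good way to compare against this simplified picture.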

general computer science concepts

  • runes - (github) (stackoverflow) (blog.golang.org)
    • runes are integer values (the rune type is an alias for int32) that each represent a particular unicode code point.
  • stack vs heap - (github) (stackoverflow)
    • the stack is an area of memory that uses the LIFO pattern and is used for rapid access to small amounts of data. visualize a stack of plates to better understand the last-in-first-out concept.
    • the heap is an area of memory where access is done in an ad-hoc manner, and is used for longer term access to large data. in general, high level scripting languages (e.g. ruby and python) allocate most if not all of their memory on the heap for the sake of simplicity at the cost of performance. visualize a large pile of clothes or trash to better understand the concept.
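
a short sketch of what “runes are code points” means in practice: a go string is a sequence of bytes, and one rune can take more than one byte once it is encoded as UTF-8. `countBytesAndRunes` is a helper name invented for this example; `utf8.RuneCountInString` comes from the standard `unicode/utf8` package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countBytesAndRunes reports the length of s in bytes and in runes.
func countBytesAndRunes(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // "héllo": the é (U+00E9) takes two bytes in UTF-8
	b, r := countBytesAndRunes(s)
	fmt.Println(b, r) // 6 5

	// ranging over a string yields runes (code points), not bytes;
	// i is the byte offset at which each rune starts
	for i, c := range s {
		fmt.Printf("byte offset %d: %c (U+%04X)\n", i, c, c)
	}
}
```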

golang general concepts

  • goroutines (github 1, 2) (gobyexample) (golang-book)
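    • goroutines are not OS threads. they’re lightweight units of concurrent execution that the go runtime schedules onto a small number of OS threads. goroutines are to the go runtime as processes are to an operating system.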
  • buffering and channels - (github) (gobyexample) (stackoverflow) (medium) (tour.golang.org)
    • channels are like IPC, except for goroutines. they allow communication and sync between goroutines.
    • buffering allows channels to act async: a buffered channel holds a fixed number of values, so a send only blocks when the buffer is full and a receive only blocks when it is empty.
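
here is a minimal sketch that puts goroutines, channels and buffering together. the function name `squares` and its parameters are invented for the example; everything else is plain go.

```go
package main

import "fmt"

// squares starts one goroutine that sends the squares of 1..n into a
// buffered channel, then collects them on the calling goroutine.
func squares(n, buffer int) []int {
	ch := make(chan int, buffer) // a send blocks only once `buffer` values are waiting
	go func() {
		defer close(ch) // tell the receiver that nothing more is coming
		for i := 1; i <= n; i++ {
			ch <- i * i
		}
	}()

	var out []int
	for v := range ch { // receive until the channel is closed
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(squares(5, 2)) // [1 4 9 16 25]
}
```

with `buffer` set to 0 the channel is unbuffered and every send waits for a matching receive; the result is the same, but the two goroutines run in lockstep.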

conclusion

whew, that was a lot, right? 😄 interestingly, I was aware of some of these concepts but wasn’t able to quite define them. now equipped with these definitions, it becomes much easier to read stuff like this and understand what’s being talked about.

up next …

now that i know quite a ton about this space, i can accurately judge whether a particular github issue is a good fit for me to work on. 🤞🏾