Tag: AI
Generative AI 2024 Retrospective
2024 witnessed a parade of increasingly powerful AI model releases, culminating in OpenAI’s groundbreaking “o3”. While I’m certain these advancements are significant, I don’t think they have translated into noticeable improvements for the average user.
What people are paying attention to, though, are the amazing product features coming out of Anthropic and Google. I’ve heard a lot about people building software with Claude’s Artifacts. And Google’s Deep Research product, released a few days ago, has completely changed search.
Running Your Own LLM UI
This is a review of Open WebUI, an extensible and user-friendly self-hosted WebUI for LLMs.
I recently decided to run my own UI layer for LLMs, as I have some exciting ideas. For those who are not familiar with chatbots, their architecture basically looks like this:
While I’m pretty good at customizing the middleware, I’m not as skilled at customizing the UI layer.
I tried writing my own UI at first, but I quickly gave up when I realized it was beyond my skills as a frontend developer. The next thing I did was look into open-source solutions. There were a lot of choices, but I narrowed them down to these three:
OpenAI and ChatGPT
I had two weeks off at the end of 2022. Telesign closed its operations in the last week of 2022 to give employees well-deserved time off after a year of hard work, and I took another week off in addition to that. Like many technologists, I became captivated by OpenAI’s release of ChatGPT in 2022, and I spent much of those two weeks exploring what OpenAI has to offer.
ChatGPT is a chatbot developed by OpenAI. It has been widely recognized for its impressive capabilities, such as imitating human writing, transforming plain English into code, and making glaringly stupid mistakes. The underlying technology of ChatGPT is a machine learning model known as GPT-3. This model is designed to predict how a human might continue a previous piece of text. For example, given what I’ve written so far, GPT-3 predicted what the rest of this blog post will look like:
Tag: Programming
My thoughts on AI in 2023
OpenAI took the world by storm in 2023 with the release of ChatGPT. Other companies quickly followed suit, introducing their own competitors to ChatGPT. In this update, I want to write down my thoughts on some of the biggest players in the field of generative AI.
When I use the term “generative AI,” I am referring to a computer program that can generate text, images, or other types of content. These programs can understand natural language and follow complex instructions. One notable example of a generative AI is OpenAI’s “GPT-4,” commonly known as ChatGPT. Another well-known generative AI is Midjourney, which specializes in generating images from text descriptions. In this update, I will focus on AI with text output, as that is the area I have the most experience with.
Use GPT-3 To Build A Code Translator
Once you know a programming language well, learning a new one is not very hard. It is just time-consuming. You need to read the documentation for basic syntax and flow control, get familiar with its idioms, memorize core parts of the standard libraries, and learn its toolchain. What can we do to speed up the learning process? One thing we can do is provide great examples in the documentation. Can we do better? What if you had working examples for every problem you encounter? What if you could describe your intent in a familiar language and see how it should look in a new language? As it turns out, GPT-3 is really good at this task.
OpenAI and ChatGPT
I had two weeks off at the end of 2022. Telesign closed its operations in the last week of 2022 to give employees well-deserved time off after a year of hard work, and I took another week off in addition to that. Like many technologists, I became captivated by OpenAI’s release of ChatGPT in 2022, and I spent much of those two weeks exploring what OpenAI has to offer.
ChatGPT is a chatbot developed by OpenAI. It has been widely recognized for its impressive capabilities, such as imitating human writing, transforming plain English into code, and making glaringly stupid mistakes. The underlying technology of ChatGPT is a machine learning model known as GPT-3. This model is designed to predict how a human might continue a previous piece of text. For example, given what I’ve written so far, GPT-3 predicted what the rest of this blog post will look like:
Unit Tests vs Integration Tests
I used to favor automated integration testing. Now, I find myself walking away from it. I no longer find it worth the cost of setting up and maintaining such tests. I now rely mostly on unit testing.
First, I need to define what I mean by those terms. In the context of this post, unit tests
In Wikipedia
unit test | integration test |
---|---|
automated, not always | not specified |
ranging from entire ‘module’ to an individual function | modules tested as a group |
depends on execution conditions and testing procedures | depends on unit test/implies modules are already unit tested |
From Martin Fowler
Dell XPS-13 - Developer Edition
This February, I ordered a Dell XPS-13 Developer Edition. The Developer Edition is a line of Dell laptops that ship with a certified Linux OS. It has been my programming powerhouse for the last eight months.
My first choice for a laptop was not Linux. Windows and macOS are simply more practical. Games and MS Office just work on Windows. Software support on macOS is generally good (except games), but it beats Windows with its underlying BSD architecture. There is better hardware support simply because more people use it. However, I ultimately ended up with a Dell and Linux and it worked out great.
Vocabulary
- bind
- Sometimes a synonym for “map”
- conflate
- combine
- disjoint
- separate, e.g., odd numbers and even numbers are disjoint
- disjunction
- inclusive or – True if at least one of the inputs is True
- extrinsic
- not intrinsic
- federated
- Top-down delegation of responsibilities; Has a single point of failure at the top.
- PID controller
- control loop feedback mechanism
- e.g., cruise control
- 99%ile
- Abbreviation of percentile
- EBNF
- Extended Backus-Naur Form
- Useful for defining the syntax of a programming language
- EBNF terminal
- a token/word/chunk in EBNF
- Alpha
- Αα
- Beta
- Ββ
- Gamma
- Γγ
- Delta
- Δδ
- Commonly denotes ‘difference’
- Epsilon
- Εε
- Used in the epsilon-greedy algorithm for multi-armed bandit problems
- Error margin in floating point comparisons.
- Zeta
- Ζζ
- Eta
- Ηη
- Theta
- Θθ
- Iota
- Ιι
- Kappa
- Κκ
- Lambda
- Λλ
- Mu
- Μμ
- Nu
- Νν
- Xi
- Ξξ
- Omicron
- Οο
- Pi
- Ππ
- Rho
- Ρρ
- Sigma
- Σσς
- Tau
- Ττ
- Upsilon
- Υυ
- Phi
- Φφ
- Chi
- Χχ
- Psi
- Ψψ
- Union
- ∪
- Aleph
- ℵ
- Symbol for cardinal numbers. ℵ₀ is pronounced aleph-null.
- Empty Set
- ∅
- Such That
- Commonly represented as a colon, :, or a vertical bar, |
- Example, D={x^2|x ∈ N, x >=1, x <= 4}. This reads D is the set of all x^2 SUCH THAT: 1) x is a natural number; 2) x is greater than or equal to 1; 3) x is less than or equal to 4.
- Intersection
- ∩
- Subset
- ⊂ or ⊆
- e.g., if A = {1,4,9} and B = {1,4}, then B ⊂ A (B is a subset of A).
- Belongs To
- ∈
- ∉ means “does not belong to”.
- To say that 1 belongs to S, we write 1 ∈ S.
- e.g., if A = {1,4,9} and e = 4, then we say e∈A, meaning “e belongs to A”. However, one would not say e⊂A – e is a single element, not a set. Similarly, if B = {1,4}, one would not say B∈A or “B belongs to A”, as B is a set not a single element.
- Complements
- Difference between two sets
- Relative Complement
- A\B means objects that belong to A and not to B, e.g., {1,2,3}\{3} == {1,2}
- Omega
- Ωω
- P(A|B)
- The likelihood of event A occurring given that B is true.
- P(A^C)
- The probability that A doesn’t happen
- Precision Recall
- Precision = probability that some retrieved doc is relevant; Recall = probability that some relevant doc was retrieved.
- Narrow Integration Tests
- exercise only that portion of the code in a service that talks to a separate service; uses test doubles of those services, either in process or remote; thus consist of many narrowly scoped tests, often no larger in scope than a unit test (and usually run with the same test framework that’s used for unit tests)
- Broad Integration Tests
- require live versions of all services, requiring substantial test environment and network access; exercise code paths through all services, not just code responsible for interactions
- Balanced Binary Search Tree
- For example, red-black tree or AVL tree.
- Natural Numbers
- ℕ
- double-struck N
- Cardinal numbers,
- Complex Numbers
- ℂ
- double-struck C
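The set vocabulary above (belongs to, subset, relative complement, disjoint) can be made concrete with a small sketch in Go. The `set` type and the helper names are my own for illustration, not from any library; a `map[int]struct{}` is just one common way to model a set.

```go
package main

import "fmt"

// set is a hypothetical minimal integer set backed by a map.
type set map[int]struct{}

func newSet(xs ...int) set {
	s := set{}
	for _, x := range xs {
		s[x] = struct{}{}
	}
	return s
}

// contains reports whether x ∈ s.
func (s set) contains(x int) bool {
	_, ok := s[x]
	return ok
}

// isSubset reports whether b ⊆ a (every element of b belongs to a).
func isSubset(b, a set) bool {
	for x := range b {
		if !a.contains(x) {
			return false
		}
	}
	return true
}

// difference returns the relative complement a \ b.
func difference(a, b set) set {
	out := set{}
	for x := range a {
		if !b.contains(x) {
			out[x] = struct{}{}
		}
	}
	return out
}

// isDisjoint reports whether a ∩ b = ∅.
func isDisjoint(a, b set) bool {
	for x := range a {
		if b.contains(x) {
			return false
		}
	}
	return true
}

func main() {
	a, b := newSet(1, 4, 9), newSet(1, 4)
	fmt.Println(a.contains(4))                       // 4 ∈ A
	fmt.Println(isSubset(b, a))                      // B ⊆ A
	fmt.Println(len(difference(newSet(1, 2, 3), newSet(3)))) // size of {1,2,3}\{3}
	fmt.Println(isDisjoint(newSet(1, 3, 5), newSet(2, 4, 6)))
}
```

Note that `contains` takes an element while `isSubset` takes a set, which mirrors the ∈ versus ⊂ distinction in the entries above.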
Related
Go io/fs Design (Part I)
As usual, LWN has a good write-up on what’s going on in the Go community. This week’s discussion is on the new io/fs package. The Go team decided to use a Reddit thread to host the conversation about this draft design. LWN points out that posters raised the following concerns:
- We added status logging by wrapping http.ResponseWriter, and now HTTP/2 push doesn’t work anymore, because our wrapper hides the Push method from the handlers downstream. / It becomes infeasible to use the decorator pattern more
- Doing it “generically” involves a combinatorial explosion of optional interfaces
Ultimately, Russ Cox admits, “It’s true - there’s definitely a tension here between extensions and wrappers. I haven’t seen any perfect solutions for that.”
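The first concern can be reproduced in a few lines. This is a minimal sketch, not the posters' actual code: `pushWriter` and `statusLogger` are hypothetical types I made up. `http.ResponseWriter` and `http.Pusher` are the real standard library interfaces.

```go
package main

import (
	"fmt"
	"net/http"
)

// pushWriter is a hypothetical ResponseWriter that also implements http.Pusher.
type pushWriter struct{ header http.Header }

func (w *pushWriter) Header() http.Header         { return w.header }
func (w *pushWriter) Write(b []byte) (int, error) { return len(b), nil }
func (w *pushWriter) WriteHeader(statusCode int)  {}
func (w *pushWriter) Push(target string, opts *http.PushOptions) error {
	return nil
}

// statusLogger is a hypothetical wrapper that records the status code.
// Embedding the interface promotes only the ResponseWriter methods.
type statusLogger struct {
	http.ResponseWriter
	status int
}

func (s *statusLogger) WriteHeader(statusCode int) {
	s.status = statusCode
	s.ResponseWriter.WriteHeader(statusCode)
}

// supportsPush is how a downstream handler would probe for HTTP/2 push.
func supportsPush(w http.ResponseWriter) bool {
	_, ok := w.(http.Pusher)
	return ok
}

// demoPush reports push support before and after wrapping.
func demoPush() (bool, bool) {
	var inner http.ResponseWriter = &pushWriter{header: http.Header{}}
	wrapped := &statusLogger{ResponseWriter: inner}
	return supportsPush(inner), supportsPush(wrapped)
}

func main() {
	before, after := demoPush()
	fmt.Println(before, after) // prints "true false"
}
```

To restore push, the wrapper would need its own `Push` method that forwards to the inner writer, and the same again for every other optional interface, which is where the combinatorial explosion comes from.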
Unit tests and system clock
It took me way too long to learn this. Your code (and its unit tests) should inject the system clock as a dependency.
For example, let’s say you have a service that writes a record to the database with the system clock.
public void save(String userName) {
    // Reads the wall clock directly, which is what makes this hard to test.
    long currentTimeMs = System.currentTimeMillis();
    User user = User.builder()
        .name(userName)
        .updateTimeMs(currentTimeMs)
        .build();
    database.save(user);
}
How would you test this? You can inject a mock database instance and use it to verify that it got a User object. Great! You can verify the username is as expected. How do you verify that tricky business rule that updateTimeMs is the “current time”?
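One way to answer that question is to make the clock a dependency. Here is a minimal sketch of the idea in Go rather than Java. The `Clock` interface, `fixedClock`, `memoryDB`, and `UserService` are all names I made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Clock abstracts "what time is it?" so a test can supply a fixed answer.
type Clock interface {
	Now() time.Time
}

// systemClock is what production code would inject.
type systemClock struct{}

func (systemClock) Now() time.Time { return time.Now() }

// fixedClock always returns the same instant; intended for tests.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

type User struct {
	Name         string
	UpdateTimeMs int64
}

// memoryDB is a hypothetical stand-in for a mock database.
type memoryDB struct{ saved []User }

func (db *memoryDB) Save(u User) { db.saved = append(db.saved, u) }

// UserService receives its clock and database explicitly.
type UserService struct {
	clock Clock
	db    *memoryDB
}

func (s *UserService) Save(userName string) {
	s.db.Save(User{Name: userName, UpdateTimeMs: s.clock.Now().UnixMilli()})
}

// savedTimestamp saves one user under a fixed clock and returns the stored time.
func savedTimestamp() int64 {
	db := &memoryDB{}
	svc := &UserService{clock: fixedClock{t: time.UnixMilli(1_700_000_000_000)}, db: db}
	svc.Save("alice")
	return db.saved[0].UpdateTimeMs
}

func main() {
	fmt.Println(savedTimestamp()) // prints 1700000000000
}
```

With the clock injected, the test asserts an exact timestamp instead of checking that the value is "close enough" to now.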
Go Project Organization
Here’s a rough layout of how I organize my Go project. Some parts are situational and some parts are essential. I’ll go over both in this blog.
A rough layout:
+ basedir
+-- go.mod (module jcheng.org)
+-- hello (empty)
+-- log/
+-- utils/
+-- config/
+-- models/
+-- repositories/
+-- services/
+-- cmd/
+-- hello_app/
+--/cmd/
+-- speak/
+-- email/
+-- sms/
The basedir
Situational.
Dependencies
A past version of me says that every class and function should be explicit about its dependencies, so that it is easily testable. John0 would say, “If you have a service that talks to a database, the database client should be an explicit dependency specified in the constructor. This makes the code easily testable.”
There is another version of myself from 10 minutes ago arguing it’s foolish to be explicit about everything. He’d point to this piece of code he’s just looked at:
Tag: Leadership
Book Review: Turn The Ship Around
My notes from Turn the Ship Around! A True Story of Turning Followers into Leaders.
- Early in the book, Marquet tells a story of having failed to empower officers under his command. The story of being unable to “empower” stuck with me. It reminded me of experiences, earlier in my career, of failing to empower people reporting into me.
- On empowerment, Marquet said “Empowerment is not how I want to be managed.” and “Empowering others feel manipulative. I believe people are empowered by nature.” This is often how I felt about empowerment.
- If not empowerment, then what? While he doesn’t mention this specifically, the way I interpreted his leadership style throughout the book is that he applied the Situational Leadership style.
- Marquet tells a story of being reprimanded after coming up with a brilliant ruse to sink an enemy submarine (during practice). The moral of that story is that a great plan which is too complex for others to execute is, in fact, a bad plan.
- tl;dr: KISS
- His story of being assigned to Santa Fe, of feeling dejected and finding the motivation to carry through is very inspiring. It is also a good example of how to be a good leader – his commander applied great leadership skills to help Marquet through it.
- On being told he’d have the full support of his command officer to succeed, then being told… “But, I don’t think it’s a good idea if you ask for A, B, and C” gave me a new perspective on “support.”
- One view of support is that “I trust you to do this” and “I think you are the best person for this job.”
- There are good ways (better than I have been doing) to communicate “these are the limitations to what we can provide to you.”
- Watch out for signs of low morale! People avoiding mistakes, meeting the minimum requirements, and “do whatever they tell me to do.”
- In chapter 13, Marquet tells a story of giving more responsibility to department heads too soon. In chapter 17, he talks about the importance of training in order for delegation to succeed. It feels like these two points should’ve been tied together more closely. Though I’m really just nitpicking now.
I suspect, if you are in the tech industry where there is already a lot of talk about autonomy and ownership, you likely already understand the theme of this book. If you believe that people need to be led or that a great personality is needed to inspire others, then you should check this out; it will offer you a useful counter perspective.
Delegation
One thing about job hopping is that you get to experience new perspectives on how people work. It forces you to reconsider what’s obvious. Take delegation for example. I used to think anyone with a bit of experience can do it effectively. It’s just about decomposing a system and assigning components to different people, right? It turns out effective delegation is much more nuanced.
Situational Leadership
Situational Leadership is a model that prescribes different delegation styles based on the competency (skill) and commitment (motivation) of each team member. Assuming motivation is not a factor, skill is the sole determinant of how much you should direct an engineer. A highly competent engineer requires very little direction. An inexperienced engineer requires specific and concrete directions.
Tag: Untagged
Book review - Bitch: On the Female of the Species
Bitch is a book from zoologist Lucy Cooke on correcting misunderstandings and cultural biases in our biological science. She explains, via animal studies, how existing notions of sex, male/female roles, and “what is natural” have been incomplete, if not incorrect.
In Bitch, Lucy goes through the behaviors of lemurs, meerkats, hyenas, moles, orcas, elephants, bonobos, termites, birds and fish to challenge conventional wisdom on nature and the roles of males and females. This alone makes for an engrossing read. I cannot help but be delighted by learning brand new things about nature and animals. It brings me back to being a child again. What makes things better is her amusing way with words. (I am jealous.) Some choice phrases from the book:
Pithy Sayings
- Chaos theory
- When the present determines the future, but the approximate present does not approximately determine the future. - Edward Lorenz
- Code
- If you can’t write it down in English, you can’t code it - Peter Halpern, via Jon Bentley
- Prioritize
- There is no such thing as two equally urgent projects, only priorities that haven’t been made clear yet. - Unknown
- Regular expression
- Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. - Jamie Zawinski
- Science
- When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together. - Isaac Asimov
Optimism
This is a story from Neil deGrasse Tyson’s StarTalk podcast. He talked about being on Brian Cox’s show and discussing the future of space travel. Neil was saying that chemical rocket engines are not going to cut it and we need to look into things like wormholes and warp drives. Then Brian cut in and explained that wormholes are fundamentally unstable and won’t work.
Brian was absolutely correct, but that was not the point. The point was about what the audience said. Cox’s audience said to them, “That’s why the Americans discover everything! They are always so optimistic!” (Cox is a British physicist and Neil is an American physicist).
Staff Plus Live 2021
Staff Plus Live 2021 was a virtual conference held by LeadDev.com on Sept 14, 2021. I believe this is their first conference aimed at Staff+ Engineers. Loosely defined, Staff+ are high-impact engineers whose success is felt across the company. In other words, these are engineers who create high leverage. I was able to attend this year’s conference and learned a lot from it. I started this post to write down what resonated with me before I forget it.
Post of the Week 2021-08-29
Just the act of measuring something can change outcomes. I love the reference to Jepsen.
Post of the Week 2021-07-25
Back from a long vacation. This week, we have folk wisdom on visual programming
https://drossbucket.com/2021/06/30/hacker-news-folk-wisdom-on-visual-programming/
Post of the Week 2021-06-03
Humans and incentives, and why some metrics (or objectives and key results) should not be public.
Post of the Week 2021-05-13
My thoughts today led me to reading about nuclear wastes. It turns out that what we call “nuclear waste” is a pretty vague term.
A light-water reactor generates a different kind of “waste” from a molten salt reactor. A traditional light-water reactor doesn’t consume fuel efficiently. It leaves behind partially consumed fuel that is not economically attractive to refine for further use. These partially spent rods continue to generate heat and are sometimes stored in cooling pools. Wastes from light-water reactors contain plutonium, which can be used to create atomic weapons.
Post of the Week 2021-04-06
Grammar visualizer
https://dundalek.com/grammkit/
- Which leads to a JS parser generator
- And also leads to a JS parser generator with a visual debugger
https://ohmlang.github.io/editor/
This also prompted my interest in a graphviz/dot parser
https://github.com/awalterschulze/gographviz
Which uses this Go parser generator
https://github.com/goccmack/gocc
This exploration also led me to this Go parser generator
Posts Of The Week 2021-04-01
I spent a couple of hours evaluating 3rd party libraries. What have I learned? For me, there’s one clear winner in a small field of candidates.
Presently, these are the top hits for “golang gauge counter timer”.
- https://pkg.go.dev/github.com/go-kit/kit/metrics
- https://pkg.go.dev/github.com/facebookgo/metrics
- https://github.com/uber-go/tally
The first result is go-kit. Go-kit isn’t a metrics library. Rather, it bills itself as a “framework for building microservices.” Its metrics package is simply a set of interfaces. You then reference one of the many sub-packages with concrete implementations. As a consequence, its go.mod file is pretty huge.
Posts Of The Week 2021-03-25
Go does not allow cyclic imports. A solution is to create a “shared” package to hold interfaces that related packages all reference. This, for some reason, reminds me of join tables in SQL.
Here is an example of a typical Go project. Packages toward the bottom, e.g., “common/persistence”, allow different packages to work with each other without introducing cyclic dependencies. For this project, “log” can be referenced by “config”, but cannot use “config” to configure itself.
Posts Of The Week 2021-03-11
Oldie but goodie. Go concurrency patterns
https://drive.google.com/file/d/1nPdvhB0PutEJzdCq5ms6UI58dp50fcAN/view
Posts Of The Week 2021-03-04
Two book recommendations
- https://www.oreilly.com/library/view/designing-data-intensive-applications/9781491903063/
- https://www.oreilly.com/library/view/making-software/9780596808310/
Designing Data Intensive Applications: Don’t let the name fool you. The knowledge in this book applies to more than just data processing applications but to distributed systems in general. It is a great book for all software architects.
Making Software: What Really Works, and Why We Believe It: A meta-analysis of various theories on software development processes. Is TDD effective? Is Agile just a hype or is it just misused? Which code metrics are actually useful? There are evidence-based answers to many of these questions.
Posts Of The Week 2021-02-18
The Perseverance rover lands on Mars. This month, the UAE, PRC, and US all sent scientific instruments to Mars.
https://www.nytimes.com/2021/02/18/science/nasa-peseverance-mars-landing.html
Posts Of The Week 2021-01-29
TextRank identifies connections between various entities in a text, and implements the concept of recommendation. A text unit recommends other related text units, and the strength of the recommendation is recursively computed based on the importance of the units making the recommendation. In the process of identifying important sentences in a text, a sentence recommends another sentence that addresses similar concepts as being useful for the overall understanding of the text
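The recursive "recommendation" idea in that quote can be sketched as a PageRank-style iteration over a sentence similarity graph. This is a rough sketch under my own assumptions: the similarity matrix is made up, the damping factor of 0.85 is the conventional PageRank choice, and `rank` is a hypothetical helper rather than any library's API.

```go
package main

import "fmt"

// rank runs a TextRank-style update on a similarity matrix, where
// sim[i][j] is the assumed similarity between sentences i and j.
func rank(sim [][]float64, damping float64, iterations int) []float64 {
	n := len(sim)
	// Total outgoing weight per sentence, used to normalize its recommendations.
	out := make([]float64, n)
	for i := range sim {
		for j := range sim[i] {
			out[i] += sim[i][j]
		}
	}
	scores := make([]float64, n)
	for i := range scores {
		scores[i] = 1
	}
	for it := 0; it < iterations; it++ {
		next := make([]float64, n)
		for i := 0; i < n; i++ {
			sum := 0.0
			for j := 0; j < n; j++ {
				if out[j] > 0 {
					// Sentence j recommends i in proportion to their similarity,
					// weighted by how important j currently is.
					sum += sim[j][i] / out[j] * scores[j]
				}
			}
			next[i] = (1 - damping) + damping*sum
		}
		scores = next
	}
	return scores
}

// demoScores ranks three sentences where the middle one is similar to both others.
func demoScores() []float64 {
	sim := [][]float64{
		{0, 0.5, 0},
		{0.5, 0, 0.5},
		{0, 0.5, 0},
	}
	return rank(sim, 0.85, 50)
}

func main() {
	fmt.Println(demoScores())
}
```

In this toy graph, the middle sentence ends up with the highest score because both of the other sentences recommend it.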
Posts Of The Week 2021-01-22
This blog is such a great example of why it is difficult to create great software. So often you have to make impossible choices between security and backward-compatibility.
Today’s Go security release fixes an issue involving PATH lookups in untrusted directories that can lead to remote execution during the go get command. We expect people to have questions about what exactly this means and whether they might have issues in their own programs. This post details the bug, the fixes we have applied, how to decide whether your own programs are vulnerable to similar problems, and what you can do if they are.
Posts Of The Week 2021-01-08
Sacha Chua is a great resource on Emacs-y things
Posts Of The Week 2020-12-31
Maybe Emacs doesn’t need to be a fusion reactor. I only hope it continues to generate energy for many years to come.
It just needs volunteers to keep the fire going.
Posts Of The Week 2020-12-18
Replicating a database can make our applications faster and increase our tolerance to failures, but there are a lot of different options available and each one comes with a price tag. It’s hard to make the right choice if we do not understand how the tools we are using work, and what are the guarantees they provide (or, more importantly, do not provide), and that’s what I want to explore here.
Posts Of The Week 2020-12-11
One strategy is a poison pill: a special message on the queue that signals the consumer of that message to end its work. To shut down the squarer, since its input messages are merely integers, we would have to choose a magic poison integer (everyone knows the square of 0 is 0 right? no one will need to ask for the square of 0…) or use null (don’t use null). Instead, we might change the type of elements on the requests queue to an ADT…
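The quoted article is not about Go, but the same poison-pill idea can be sketched with channels. This is my own illustration: the `request` interface plays the role of the ADT, and `squareRequest`, `stop`, and `squareAll` are hypothetical names. In idiomatic Go you would often just close the channel instead.

```go
package main

import "fmt"

// request is a small sum type: either a squareRequest or the stop signal.
type request interface{ isRequest() }

type squareRequest struct{ n int }
type stop struct{}

func (squareRequest) isRequest() {}
func (stop) isRequest()          {}

// squarer answers requests until it receives the poison pill.
func squarer(requests <-chan request, replies chan<- int) {
	defer close(replies)
	for req := range requests {
		switch r := req.(type) {
		case squareRequest:
			replies <- r.n * r.n
		case stop:
			return
		}
	}
}

// squareAll sends every input, then the poison pill, and collects the replies.
func squareAll(inputs []int) []int {
	requests := make(chan request)
	replies := make(chan int)
	go squarer(requests, replies)
	go func() {
		for _, n := range inputs {
			requests <- squareRequest{n: n}
		}
		requests <- stop{}
	}()
	var out []int
	for v := range replies {
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(squareAll([]int{0, 3, 4})) // prints [0 9 16]
}
```

Because the stop signal is its own type, 0 stays a perfectly valid input and no magic integer is needed.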
Posts Of The Week 2020-11-27
Meet GPT-3. It Has Learned to Code (and Blog and Argue).
For many artificial intelligence researchers, it is an unexpected step toward machines that can understand the vagaries of human language — and perhaps even tackle other human skills.
Posts of the Week 11/06/20
https://rootsofprogress.org/immunization-from-inoculation-to-rna-vaccines
When you get your covid shot (probably in 2021), take a moment to think back on the 300 years of progress that got us to this point.
https://github.com/wbolster/emacs-python-black
This is an Emacs package to make it easy to reformat Python code using black, the uncompromising Python code formatter.
Posts of the Week 10/08/20
What are your favorite CLI apps?
I’m looking for CLI utilities that are definitely not part of the POSIX required or optional utilities, and more colloquially not considered to be standard BSD or *nix fare.
My daughters couldn’t stop laughing during storytime. They actually enjoy bedtime now. Birdsall is an excellent writer.
This is a whirlwind tour of writing parsers by hand. Why would you want to do that, when tools like Yacc exist to do it for you?
2020 Emacs User Survey
I recently participated in the 2020 Emacs User Survey. One of the questions asked is “When you were first learning Emacs, what did you find difficult to learn?” The obvious answer is keyboard shortcuts, e.g., instead of CTRL-S for save, it is CTRL-X CTRL-S. Instead, CTRL-S performs find, which is usually mapped to CTRL-F, and so on and so forth.
Why Emacs is hard for new users
There were other problems too. I didn’t put them all down in the survey. I’ll jot them down here as they come to mind.
Webcam on Linux (Razer Kiyo)
I recently got a Razer Kiyo for my Zoom meetings. Currently, I need to use it on my Linux laptop, and it works rather well. I was skeptical that it would “just work” on Linux, but it did! There is also a software package named v4l-utils that allows you to configure the zoom (crop) level of your video, which is nice for cutting out undesirable background.
apt-get install v4l-utils
You can get a listing of attached cameras with
Posts of the Week 10/23/20
How to hire for your organization
A three-part series from 9/21/20 to 10/5/20 on how to hire for your organization.
The Temporal Workflow Framework
Let’s look at a use case. A customer signs up for an application with a trial period. After the period, if the customer has not cancelled, he should be charged once a month for the renewal. The customer has to be notified by email about the charges and should be able to cancel the subscription at any time.
Dell XPS-13 - Developer Edition
This February, I ordered a Dell XPS-13 Developer Edition. The Developer Edition is a line of Dell laptops that ship with a certified Linux OS. It has been my programming powerhouse for the last eight months.
My first choice for a laptop was not Linux. Windows and macOS are simply more practical. Games and MS Office just work on Windows. Software support on macOS is generally good (except games), but it beats Windows with its underlying BSD architecture. There is better hardware support simply because more people use it. However, I ultimately ended up with a Dell and Linux and it worked out great.
Posts of the Week 10/08/20
The Ultimate Guide to GPT3 from Twilio
A step-by-step walkthrough of what it is like to use the GUI to GPT3 from OpenAI
A Columnist Makes Sense of Wall Street Like None Other (See Footnote)
Mr. Levine wasn’t always a darling of business media and finance Twitter. (The best measure of his audience’s devotion may not be his 112,000 Twitter followers, but rather the 3,000 that follow @MattLevineBot, a fan account describing itself as a bot that mimics his writing style.) He began his post-collegiate career as a Latin teacher, then worked as a lawyer at Wachtell, Lipton, Rosen & Katz before advancing to Goldman. Despite having made more money at white-shoe law and Wall Street firms than he does as a writer, Mr. Levine says he is happier now. He is doing exactly what he has long wanted to do. This is the story of his ascension. It begins with an escalator.
Creating New Abstractions
Software engineers find it natural to talk about abstractions. We have ideas such as Decorators, Model-View-Controller, and Message Queues. These abstractions allow software engineers to talk to each other using our own rich and succinct language. Abstractions, however, are not unique to engineers. In my role as a parent and an American citizen, I am constantly confronted with new abstract ideas arising out of life.
The abstractions I am referring to are new words and ideas coming out of our shared culture. For example, words that did not exist twenty years ago: Me-Too and BLM as well as a redefinition of words like gender and socialism.
Post(s) of the Week Sept 2020
The Empathy Gap from Effectiviology.com
For example, if a person is currently feeling calm, the empathy gap can cause them to struggle to predict how they will act when they’re angry. Similarly, if a person who is on a diet is currently full, the empathy gap can cause them to struggle to assess how well they will be able to handle the temptation to eat when they’re hungry.
Tag: Go
Interpreter in Go - 4
In Chapter 1.3 of Ball’s Writing an Interpreter in Go, we encounter one design decision of his Monkey programming language. Here, the lexer has a NextToken() method that looks like this:
func (l *Lexer) NextToken() token.Token {
    var tok token.Token
    switch l.ch {
    // [...]
    default:
        if isLetter(l.ch) {
            tok.Literal = l.readIdentifier()
            return tok
        } else {
            tok = newToken(token.ILLEGAL, l.ch)
        }
    }
    // [...]
}
This means the lexer itself does not do backtracking. The meaning of a character at any point cannot be ambiguous. You cannot say, for example, that ‘+’ is the ‘plus’ token unless it is in the middle of a variable name. I don’t know many programming languages that support such behavior – so it is probably an acceptable design decision. You know what they say, “Keep it Simple, Smartypants”. I’m not sure if there are other notable constraints introduced by the design at this point, but it is something that tickles my overly analytical brain.
Interpreter in Go - 3
A lexer takes the source code, a sequence of characters, and groups them into tokens, e.g., it makes the first decision on how to process the strings 100-10, -100-10, and -100--100 into groups. I’m going to call this grouping “tokenization” even though I may be misusing the term.
Tokenizing source code is hard. How should -100--100 be tokenized? Should it be a literal -100 followed by the minus token, followed by another -100?
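One possible answer, sketched below, is to have the lexer never produce negative literals: every - becomes its own token, and deciding whether it means negation or subtraction is left to the parser. This is only an illustration of that one option; `tokenize` is a hypothetical helper, not the book's lexer.

```go
package main

import "fmt"

// tokenize splits src into integer literals and single-character "-" tokens.
// It never emits a negative literal; the sign is always a separate token.
func tokenize(src string) []string {
	var tokens []string
	for i := 0; i < len(src); {
		c := src[i]
		switch {
		case c == '-':
			tokens = append(tokens, "-")
			i++
		case c >= '0' && c <= '9':
			start := i
			for i < len(src) && src[i] >= '0' && src[i] <= '9' {
				i++
			}
			tokens = append(tokens, src[start:i])
		default:
			i++ // this sketch skips any other character
		}
	}
	return tokens
}

func main() {
	fmt.Println(tokenize("100-10"))    // prints [100 - 10]
	fmt.Println(tokenize("-100--100")) // prints [- 100 - - 100]
}
```

The lexer stays context-free this way, at the cost of pushing the ambiguity into the parser.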
Interpreter in Go - 2
Writing an Interpreter In Go by Thorsten Ball will be my personal introduction to writing an interpreter. I’ve never taken a comp sci class before, so I know nothing about compilers. On a lark, I decided to explore this area now, nearly 20 years after I started to learn computer programming.
If you are interested in this book as well, you might find the AST Explorer a useful companion.
I was told at some point in the past that compilation can be broken down into four stages:
Interpreter in Go - 1
It happened. At the recommendation of https://twitter.com/dgryski, I bought Writing an Interpreter In Go. This will be my next hobby project. It’ll be interesting to see if I ever finish it.
Posts Of The Week 2021-04-15
Lots of small things today.
History
“Those who cannot remember the past are condemned to repeat it”. I love reading about how software came to be the way it is today.
Before we can talk about where generics are going, we first have to talk about where they are, and how they got there.
https://cr.openjdk.java.net/~briangoetz/erasure.html
Go Errors
Error handling is still jacked in Go 1.16. That is, the formatting change is still not present. Why is this a problem? There are two use cases for errors: errors as values, which can be inspected programmatically, and error printing, which is not meant for programmatic consumption.
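The two use cases can be seen side by side in a small sketch. `ErrNotFound` and `lookup` are names I made up; `errors.Is` and the `%w` verb are standard library features available since Go 1.13.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error used as a value.
var ErrNotFound = errors.New("not found")

// lookup wraps the sentinel with context using %w.
func lookup(key string) error {
	return fmt.Errorf("lookup %q: %w", key, ErrNotFound)
}

// demoErr returns the programmatic check and the printed form of the same error.
func demoErr() (bool, string) {
	err := lookup("user-42")
	return errors.Is(err, ErrNotFound), err.Error()
}

func main() {
	isNotFound, message := demoErr()
	fmt.Println(isNotFound) // error as a value: prints true
	fmt.Println(message)    // error printing: prints lookup "user-42": not found
}
```

The inspection side works through wrapping, while the printed form is just a flat string.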
Workflow Orchestration - Part 3 (How do I use this?)
In this part of the series, we’ll write some hands-on Temporal code and run it. Let’s start with our requirements:
You need to transmit a data packet. You can choose from multiple Route Providers to do this. Transmission takes time – you will be notified on a callback URL when the packet is delivered. Delivery may fail – either because the acknowledgement was not sent or arrived late (because Internet). You should try the next provider when one fails.
Posts Of The Week 2021-04-01
I spent a couple of hours evaluating 3rd party libraries. What have I learned? For me, there’s one clear winner in a small field of candidates.
Presently, these are the top hits for “golang gauge counter timer”.
- https://pkg.go.dev/github.com/go-kit/kit/metrics
- https://pkg.go.dev/github.com/facebookgo/metrics
- https://github.com/uber-go/tally
The first result is go-kit. Go-kit isn’t a metrics library. Rather, it bills itself as a “framework for building microservices.” Its metrics package is simply a set of interfaces. You then reference one of the many sub-packages with concrete implementations. As a consequence, its go.mod file is pretty huge.
Posts Of The Week 2021-03-25
Go does not allow cyclic imports. A solution is to create a “shared” package to hold interfaces that related packages all reference. This, for some reason, reminds me of join tables in SQL.
Here is an example of a typical Go project. Packages toward the bottom, e.g., “common/persistence”, allow different packages to work with each other without introducing cyclic dependencies. For this project, “log” can be referenced by “config”, but cannot use “config” to configure itself.
Posts Of The Week 2021-03-11
Oldie but goodie. Go concurrency patterns
https://drive.google.com/file/d/1nPdvhB0PutEJzdCq5ms6UI58dp50fcAN/view
Go: Pointer vs Value
In A Tour of Go, it states “Go has pointers. A pointer holds the memory address of a value.” When you design your data structure in Go, you have to decide between using a pointer or not. There’s no clear rule of thumb for it.
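One concrete difference that feeds into the decision is what happens on mutation. A minimal sketch, using a hypothetical `Counter` type that is not from the original post: a value receiver works on a copy, while a pointer receiver changes the caller's data.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	N int
}

// IncValue has a value receiver: it increments a copy, so the caller's
// Counter is left unchanged.
func (c Counter) IncValue() { c.N++ }

// IncPointer has a pointer receiver: it increments the caller's Counter.
func (c *Counter) IncPointer() { c.N++ }

func main() {
	c := Counter{}
	c.IncValue()
	fmt.Println(c.N) // 0
	c.IncPointer()
	fmt.Println(c.N) // 1
}
```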
I had been reading the Go source code to AWS’s client library for DynamoDB. For a while, I had been annoyed with their API design, which looks like this:
Go io/fs Design (Part I)
As usual, LWN has a good write up on what’s going on in the Go community. This week’s discussion is on the new io/fs package. The Go team decided to use a Reddit thread to host the conversation about this draft design. LWN points out that posters raised the following concerns:
- We added status logging by wrapping http.ResponseWriter, and now HTTP/2 push doesn’t work anymore, because our wrapper hides the Push method from the handlers downstream. / It becomes infeasible to use the decorator pattern more
- Doing it “generically” involves a combinatorial explosion of optional interfaces
Ultimately, Russ Cox admits, “It’s true - there’s definitely a tension here between extensions and wrappers. I haven’t seen any perfect solutions for that.”
Localstack S3 and Go
I spent too much time Saturday getting the Go S3 SDK to work with LocalStack. It turns out that if you are using LocalStack, you need to explicitly configure the following properties:
sess, err := session.NewSession(aws.NewConfig().
	WithEndpoint(endpoint).
	WithRegion(region).
	WithS3ForcePathStyle(true))
The requirements for Endpoint and Region are obvious. If S3ForcePathStyle is not specified, then LocalStack will fail.
data, err := svc.GetObject(&s3.GetObjectInput{
	Bucket: &bucket,
	Key:    &cfgKey,
})
What is path-style? In May 2019, Amazon deprecated the path-style access model for S3 objects. This means one should no longer use URLs of the form:
Gomock Tutorial
I’ve been using mock/Gomock to write tests in my personal project. When you’re building something in a new language, it is hard to prioritize learning every tool in your toolchain. For me, I’ve been writing custom and suboptimal code for Gomock because of a nifty but undocumented API call .Do.
In many cases, I want to match subsets of a complex object while ignoring irrelevant parts, e.g., verify a function is invoked with a list of User objects, but only verifying the email addresses. To do that in a generic way, I wrote a custom Matcher API that uses text/template to describe what parts of the object to match. Thus, my mock-and-verify code looks like:
Go Project Organization
Here’s a rough layout of how I organize my Go project. Some parts are situational and some parts are essential. I’ll go over both in this blog.
A rough layout:
+ basedir
+-- go.mod (module jcheng.org)
+-- hello (empty)
+-- log/
+-- utils/
+-- config/
+-- models/
+-- repositories/
+-- services/
+-- cmd/
+-- hello_app/
+--/cmd/
+-- speak/
+-- email/
+-- sms/
The basedir
Situational.
Mocking in Go Part 2
In a previous post, I talked about Gomock. I want to spend a bit more time on it.
As I’ve mentioned, setting up Gomock requires
- Download mockgen
go get github.com/golang/mock/mockgen@latest
- Edit your go.mod
module jcheng.org
go 1.13
require (
github.com/golang/mock v1.4.0
)
- Add the go:generate incantation to your code
//go:generate mockgen -source $GOFILE -destination mock_$GOFILE -package $GOPACKAGE
package pastself
import (
"time"
)
...
- Use mocks in your test case. In this example, I used a callback function to match on a nested property.
type fnmatcher struct {
	d    func(x interface{}) bool
	desc string
}

func (self *fnmatcher) Matches(x interface{}) bool {
	return self.d(x)
}

func (self *fnmatcher) String() string {
	return self.desc
}

func On(desc string, d func(x interface{}) bool) mock.Matcher {
	return &fnmatcher{d: d, desc: desc}
}

// TestSendOverdueMessages_ok is an example of using matching using a callback function
func TestSendOverdueMessages_ok(t *testing.T) {
	ctrl := mock.NewController(t)
	defer ctrl.Finish()
	mockUserRepo := NewMockUserRepository(ctrl)
	mockMessageRepo := NewMockMessageRepository(ctrl)
	mockMessageSender := NewMockMessageSender(ctrl)
	log, _ := NewBufferLog()
	r := NewPastSelfService(
		mockMessageRepo,
		mockMessageSender,
		mockUserRepo,
		log,
	)
	senderUserDDB_1 := &UserDDB{UserID: "sender1"}
	senderUserDDB_2 := &UserDDB{UserID: "sender2"}
	recipientUserDDB_2 := &UserDDB{
		UserID: "recipient2",
		Profile: UserProfileDDB{
			PreferredEmail: "recipient2@example.com",
		},
	}
	resultSet := &ResultSet{
		Items: []Message{
			{
				ID:         1,
				SenderID:   "sender1",
				Body:       "body1",
				Recipients: []string{"email://recipient1@example.com", "recipient2"},
				Headers: []Header{
					{Name: "x-header-key-1", Value: "value-1"},
					{Name: "x-header-key-2", Value: "value-2"},
				},
			},
			{
				ID:       2,
				SenderID: "sender2",
				Body:     "body2",
			},
		},
	}
	mockUserRepo.EXPECT().GetUser("sender1").Return(senderUserDDB_1, nil)
	mockUserRepo.EXPECT().GetUser("sender2").Return(senderUserDDB_2, nil)
	mockUserRepo.EXPECT().GetUser("recipient2").Return(recipientUserDDB_2, nil)
	mockMessageRepo.EXPECT().FindOverdue(mock.Any(), 0, "").Return(resultSet, nil)
	matchFrom := On("user.UserID==sender1", func(x interface{}) bool {
		if y, ok := x.(*UserDDB); ok {
			return y.UserID == "sender1"
		}
		return false
	})
	matchTo := On("[]Recipient{recipient1@example.com,recipient2@example.com}", func(x interface{}) bool {
		if y, ok := x.([]Recipient); ok {
			return len(y) == 2 &&
				y[0].Email == "recipient1@example.com" &&
				y[1].Email == "recipient2@example.com"
		}
		return false
	})
	matchOutMessage := On("OutboundMessage.Body==Body1", func(x interface{}) bool {
		if y, ok := x.(OutboundMessage); ok {
			return y.Body == "body1"
		}
		return false
	})
	mockMessageSender.EXPECT().Send(matchFrom, matchTo, matchOutMessage).Return(nil).MaxTimes(1)
	r.SendOverdueMessages()
}
One thing nice about GoMock is that it generates statically typed method names for the EXPECT() statements, so the compiler can check that you’re using the correct method names. Neat.
Few Annoying Things about Go
As promised, a few things about Go that are annoying.
No Generics
Code generation is tightly coupled with the tool chain. When I need to use code generation, e.g., for enums and mocks, the use case can be solved with generics instead.
No lambda syntax
Scala has a lambda syntax to make it easy to work with anonymous function objects:
val numbers = Seq(1, 5, 2, 100)
val doubled = numbers.map(
  n => n * 2
)
Java has a similar syntax
Mocking in Go
Today I want to talk a little about the testify and gomock packages and doing mocks in Go.
Mocking is essential when writing code that depends on external services, e.g., microservices, message brokers, and datastores. For many people, this means web applications: My system has business rules and talks to multiple services. How do I test the business rules without setting up all the services locally?
Some people argue that mocks create a false sense of test coverage and introduce blind spots. I think that’s partially true but also too simplistic. In many cases, engineers gain sufficient value from testing “glue code” to make mocking worthwhile. This follows the principle of don’t let perfect be the enemy of good.
Why I like Go
I have a few side projects, published and unpublished. By trade, I’ve come to programming as a Java programmer. I started coding directly against the Servlet API and have coded using Enterprise Java Beans, Play Framework, Spring Framework, and even some Scala. Lately, I’ve been coding mainly in Go with some Python.
Go is a really nice programming language. Some people like it for its simplicity. I want to offer a different take on why you should learn and use Go for your own projects.
Viper Examples
Viper is a configuration library for Go. It has been my go-to library for configs. Some of the best features of Viper are:
- Ability to unmarshal subsets of a configuration file into a Go struct
- Ability to override contents of a configuration file using OS environment variables
Here’s a brief example that demonstrates both features:
Assume you have a configuration file at $HOME/server.toml
resolve_dns = true
http_keep_alive = 5
[example_com]
doc_root = "/var/www/example_com"
allow_files = "*.html"
login_required = true
login_lockout_count = 3
[example_org]
doc_root = "/var/www/example_org"
allow_files = "*.html, *.jpg"
login_required = false
login_lockout_count = 5
Then you can utilize the configuration file using this snippet:
Go1 13 Errors
For me, Go 1.13 arrived with anticipation of better error handling. Presently, the second Google search result for Go 1.13 error handling is an article that refers to the “xerrors” package. One feature of the xerrors package is that it produced errors with a stack trace showing where the error came from. The addition to Go 1.13, however, did not include this particular feature, and left me spending frustrating hours trying to debug the loss of stack trace after switching from xerrors to the standard library.
Own Your Data
I previously wrote about owning my own data. An important part of data ownership is backing up your data. I use S3 as my long term data store. It is pretty easy to set this up using Terraform.
S3
Provisioning an S3 bucket is simply a single Terraform resource:
resource "aws_s3_bucket" "repo_archive_log" {
  acl    = "log-delivery-write"
  bucket = "example-bucket"
  tags = {
    Name      = "example"
    TTL       = "persistent"
    ManagedBy = "Terraform"
  }
}
Tag: InterpreterInGo
Interpreter in Go - 4
In Chapter 1.3 of Ball’s Writing an Interpreter in Go, we encounter one design decision of his Monkey programming language. Here, the lexer has a NextToken() method that looks like this:
func (l *Lexer) NextToken() token.Token {
	var tok token.Token
	switch l.ch {
	// [...]
	default:
		if isLetter(l.ch) {
			tok.Literal = l.readIdentifier()
			return tok
		} else {
			tok = newToken(token.ILLEGAL, l.ch)
		}
	}
	// [...]
}
This means the lexer itself does not do backtracking. The meaning of a character at any point cannot be ambiguous. You cannot say, for example, that ‘+’ is the ‘plus’ token unless it is in the middle of a variable name. I don’t know many programming languages that support such behavior – so it is probably an acceptable design decision. You know what they say, “Keep it Simple, Smartypants”. I’m not sure if there are other notable constraints introduced by the design at this point, but it is something that tickles my overly analytical brain.
Interpreter in Go - 3
A lexer takes the source code, a sequence of characters, and groups them into tokens, e.g., it makes the first decision on how to process the strings 100-10, -100-10, and -100--100 into groups. I’m going to call this grouping “tokenization” even though I may be misusing the term.
Tokenizing source code is hard. How should -100--100 be tokenized? Should it be a literal -100 followed by the minus token, followed by another -100?
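One possible answer, sketched here as an assumption rather than as what the book prescribes, is to never fold the sign into the number: every - becomes its own token, and a later stage decides whether it is a negation or a subtraction. The `tokenize` function below is a hypothetical helper that only understands digits and single-character operators.

```go
package main

import "fmt"

// tokenize splits s into runs of digits and single-character tokens.
// A '-' is always emitted on its own; it is never merged into a number.
func tokenize(s string) []string {
	var tokens []string
	for i := 0; i < len(s); {
		if s[i] >= '0' && s[i] <= '9' {
			start := i
			for i < len(s) && s[i] >= '0' && s[i] <= '9' {
				i++
			}
			tokens = append(tokens, s[start:i])
			continue
		}
		tokens = append(tokens, string(s[i]))
		i++
	}
	return tokens
}

func main() {
	fmt.Println(tokenize("-100--100")) // [- 100 - - 100]
}
```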
Interpreter in Go - 2
Writing an Interpreter In Go by Thorsten Ball will be my personal introduction to writing an interpreter. I’ve never taken a comp sci class before, so I know nothing about compilers. On a lark, I decided to explore this area now, nearly 20 years after I started to learn computer programming.
If you are interested in this book as well, you might find the AST Explorer a useful companion.
I was told at some point in the past that compilation can be broken down into four stages:
Interpreter in Go - 1
It happened. At the recommendation of https://twitter.com/dgryski, I bought Writing an Interpreter In Go. This will be my next hobby project. It’ll be interesting to see if I ever finish it.
Tag: Tags
Posts Of The Week 2021-04-15
Lots of small things today.
History
“Those who cannot remember the past are condemned to repeat it”. I love reading about how software came to be the way it is today.
Before we can talk about where generics are going, we first have to talk about where they are, and how they got there.
https://cr.openjdk.java.net/~briangoetz/erasure.html
Go Errors
Error handling is still jacked in Go 1.16. That is, the formatting change is still not present. Why is this a problem? There are two use cases for errors: errors as values, which can be inspected programmatically, and error printing, which is not meant for programmatic consumption.
Tag: Orchestration
Workflow Orchestration - Part 3 (How do I use this?)
In this part of the series, we’ll write some hands-on Temporal code and run it. Let’s start with our requirements:
You need to transmit a data packet. You can choose from multiple Route Providers to do this. Transmission takes time – you will be notified on a callback URL when the packet is delivered. Delivery may fail – either because the acknowledgement was not sent or arrived late (because Internet). You should try the next provider when one fails.
Workflow Orchestration - Part 2 (Why do I care?)
An increasingly distributed and fragile world
Workflow platforms are important because software engineers are increasingly adopting distributed systems in their architecture. There are two reasons for this change: 1) Users are demanding more frequent releases, feature teams, better performance, and higher availability; 2) Providers are increasingly moving away from “use our library” (Spring Framework) to “use our APIs” (AWS, Azure, and GCP).
This change is undoubtedly a good thing; however, it also introduces new problems. It is much harder to trace program execution in a distributed system. A business process can span multiple services, created by multiple teams, in a variety of programming languages. There are more ways for things to fail, less consistency in code quality and documentation, and it’s harder to understand what happens when things go wrong.
Workflow Orchestration 1 - What is a workflow?
Introduction
It is hard to describe what a Workflow Platform is. It is both familiar and exotic. There are aspects of the problem space we all know well: Retries, eventual consistency, message processing semantics, visibility, heartbeating, and distributed processing to name a few. Yet, when they’re all put together in a pretty package with a bow tied on top, it becomes something almost magical. It feels like seeing the iPhone for the first time: Of course you want a touch screen on a cellphone and mobile internet access. Similarly, a workflow platform feels like the only natural way to solve problems. Once you learn it, anything else feels as clunky as using a feature phone.
Tag: Python
Posts of the Week 11/06/20
https://rootsofprogress.org/immunization-from-inoculation-to-rna-vaccines
When you get your covid shot (probably in 2021), take a moment to think back on the 300 years of progress that got us to this point.
https://github.com/wbolster/emacs-python-black
This is an Emacs package to make it easy to reformat Python code using black, the uncompromising Python code formatter.
Unit tests and system clock
It took me way too long to learn this. Your code (and its unit tests) should inject the system clock as a dependency.
For example, let’s say you have a service that writes a record to the database with the system clock.
public void save(String userName) {
    long currentTimeMs = System.currentTimeMillis();
    User user = User.builder()
        .name(userName)
        .updateTimeMs(currentTimeMs)
        .build();
    database.save(user);
}
How would you test this? You can inject a mock database instance and use it to verify that it got a User object. Great! You can verify the username is as expected. How do you verify that tricky business rule that updateTimeMs is the “current time”?
Tag: Docker
Docker With Custom DNS Server
By configuring your Docker containers to talk to a custom DNS server, you gain more control over how your containers look up other services: databases, other microservices, etc.
It turns out Docker networking isn’t completely straightforward. On Linux you can:
- Run a DNS server locally, either dnsmasq or devdns
- Run your container with --dns 172.17.0.1, the magic IP where your host machine is located
- Configure dnsmasq from step 1 via /etc/hosts
Sadly, none of it was easy. The entire thing probably took me three hours.
Running Private Docker Registry
I still find it hard to believe how easy it is to run your own infrastructure in the cloud.
Running a Docker registry is as simple as adding a few lines of code to your Terraform configuration.
resource "aws_ecr_repository" "foo" {
  name                 = "bar"
  image_tag_mutability = "MUTABLE"
  image_scanning_configuration {
    scan_on_push = true
  }
}
When deployed, this creates a registry where you can manage multiple Docker repositories. You can upload a Docker image to be used in your private ECS cluster:
Tag: AWS
AWS Step Functions
I briefly worked on a workflow system at $past_job, implemented using AWS Step Functions. My experience was pretty terrible. I wasn’t sure which technical requirements led the team to this system. Some people said we needed a “system that is configuration driven and not code driven” and some people said “we needed something that scales.” Whatever the reason was, making improvements to this system was a pain in the ass, with AWS Step Functions itself being somewhat responsible.
I want to love DynamoDB
I want to love DynamoDB. I love that it just scales (disk usage and processing power). I love that it is tightly integrated with AWS’s IAM model, so I don’t have to deal with user/role/permissions management.
But DynamoDB does some weird things by design. For example, only the primary key can be unique. If you want a table with multiple unique attributes, for example, a Users table where both the user_name and email are unique, you’ll have to do weird things like this.
Localstack S3 and Go
I spent too much time Saturday getting the Go S3 SDK to work with LocalStack. It turns out that if you are using LocalStack, you need to explicitly configure the following properties:
sess, err := session.NewSession(aws.NewConfig().
	WithEndpoint(endpoint).
	WithRegion(region).
	WithS3ForcePathStyle(true))
The requirements for Endpoint and Region are obvious. If S3ForcePathStyle is not specified, then LocalStack will fail.
data, err := svc.GetObject(&s3.GetObjectInput{
	Bucket: &bucket,
	Key:    &cfgKey,
})
What is path-style? In May 2019, Amazon deprecated the path-style access model for S3 objects. This means one should no longer use URLs of the form:
Own Your Data
I previously wrote about owning my own data. An important part of data ownership is backing up your data. I use S3 as my long term data store. It is pretty easy to set this up using Terraform.
S3
Provisioning an S3 bucket is simply a single Terraform resource:
resource "aws_s3_bucket" "repo_archive_log" {
  acl    = "log-delivery-write"
  bucket = "example-bucket"
  tags = {
    Name      = "example"
    TTL       = "persistent"
    ManagedBy = "Terraform"
  }
}
Tag: Math
Deriving Average Without Totals
Even an old dog like me learns something new every day.
Let’s say you need a program that receives a sequence of numbers and outputs the current average. How do you do this efficiently? The naive solution is to keep track of two variables:
total # add up all numbers seen so far
count # a count of how many numbers seen so far
This way, at any step, you can calculate the current average using total/count.
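As the title suggests, the total can be dropped. A common way to do that, sketched here as an assumption about where the post is heading, is to update the mean in place with mean += (x - mean) / n. The `RunningAverage` type is a hypothetical name; it still keeps a count, but no ever-growing sum.

```go
package main

import "fmt"

// RunningAverage keeps the current mean and a count, but no running total.
type RunningAverage struct {
	mean  float64
	count int
}

// Add folds x into the mean using mean += (x - mean) / n and returns the
// updated mean.
func (r *RunningAverage) Add(x float64) float64 {
	r.count++
	r.mean += (x - r.mean) / float64(r.count)
	return r.mean
}

func main() {
	var avg RunningAverage
	for _, x := range []float64{2, 4, 6} {
		fmt.Println(avg.Add(x)) // prints 2, then 3, then 4
	}
}
```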
Vocabulary
- bind
- Sometimes a synonym for “map”
- conflate
- combine
- disjoint
- separate, e.g., odd numbers and even numbers are disjoint
- disjunction
- inclusive or – if one of the inputs is
True
- extrinsic
- not intrinsic
- federated
- Top-down delegation of responsibilities; Has a single point of failure at the top.
- PID controller
- control loop feedback mechanism
- e.g., cruise control
- 99%ile
- Abbreviation of percentile
- EBNF
- Extended Backus-Naur Form
- Useful for defining the syntax of a programming language
- EBNF terminal
- a token/word/chunk in EBNF
- Alpha
- Αα
- Beta
- Ββ
- Gamma
- Γγ
- Delta
- Δδ
- Commonly denotes ‘difference’
- Epsilon
- Εε
- Used in the epsilon-greedy algorithm for multi-armed bandit problems
- Error margin in floating point comparisons.
- Zeta
- Ζζ
- Eta
- Ηη
- Theta
- Θθ
- Iota
- Ιι
- Kappa
- Κκ
- Lambda
- Λλ
- Mu
- Μμ
- Nu
- Νν
- Xi
- Ξξ
- Omicron
- Οο
- Pi
- Ππ
- Rho
- Ρρ
- Sigma
- Σσς
- Tau
- Ττ
- Upsilon
- Υυ
- Phi
- Φφ
- Chi
- Χχ
- Psi
- Ψψ
- Union
- ∪
- Aleph
- ℵ
- Symbol for cardinal numbers. ℵ₀ is pronounced as Aleph-null.
- Empty Set
- ∅
- Such That
- Commonly represented as a colon, :, or a vertical bar, |
- Example, D={x^2|x ∈ N, x >=1, x <= 4}. This reads D is the set of all x^2 SUCH THAT: 1) x is a natural number; 2) x is greater than or equal to 1; 3) x is less than or equal to 4.
- Intersection
- ∩
- Subset
- ⊂ or ⊆
- e.g., if A = {1,4,9} and B = {1,4}, then B ⊂ A (B is a subset of A).
- Belongs To
- ∈
- ∉ means “does not belong to”.
- To say that 1 belongs to S, we write 1 ∈ S.
- e.g., if A = {1,4,9} and e = 4, then we say e∈A, meaning “e belongs to A”. However, one would not say e⊂A – e is a single element, not a set. Similarly, if B = {1,4}, one would not say B∈A or “B belongs to A”, as B is a set not a single element.
- Complements
- Difference between two sets
- Relative Complement
- A\B means objects that belong to A and not to B, e.g., {1,2,3}\{3} == {1,2}
- Omega
- Ωω
- P(A|B)
- The likelihood of event A occurring given that B is true.
- P(A^C)
- The probability that A doesn’t happen
- Precision Recall
- Precision = probability that some retrieved doc is relevant; Recall = probability that some relevant doc was retrieved.
- Narrow Integration Tests
- exercise only that portion of the code in a service that talks to a separate service; uses test doubles of those services, either in process or remote; thus consist of many narrowly scoped tests, often no larger in scope than a unit test (and usually run with the same test framework that’s used for unit tests)
- Broad Integration Tests
- require live versions of all services, requiring substantial test environment and network access; exercise code paths through all services, not just code responsible for interactions
- Balanced Binary Search Tree
- For example, red-black tree or AVL tree.
- Natural Numbers
- ℕ
- double-struck N
- Cardinal numbers
- Complex Numbers
- ℂ
- double-struck C
Tag: Books
Managing Humans
I finished Managing Humans by Michael Lopp today. I knew of Lopp through “Rands Leadership”, the Slack community he created. I enjoyed learning from the people in the community and thought I’d pick up his book. I’m not sure what I expected going into the book. I think I expected tips and rules that I can follow. After trying to summarize the book, I think what I wanted and got was a set of mental models that I am happy to add to my toolbox.
Tag: Management
Managing Humans
I finished Managing Humans by Michael Lopp today. I knew of Lopp through “Rands Leadership”, the Slack community he created. I enjoyed learning from the people in the community and thought I’d pick up his book. I’m not sure what I expected going into the book. I think I expected tips and rules that I can follow. After trying to summarize the book, I think what I wanted and got was a set of mental models that I am happy to add to my toolbox.
Tag: TDD
Gomock Tutorial
I’ve been using mock/Gomock to write tests in my personal project. When you’re building something in a new language, it is hard to prioritize learning every tool in your toolchain. For me, I’ve been writing custom and suboptimal code for Gomock because of a nifty but undocumented API call .Do.
In many cases, I want to match subsets of a complex object while ignoring irrelevant parts, e.g., verify a function is invoked with a list of User objects, but only verifying the email addresses. To do that in a generic way, I wrote a custom Matcher API that uses text/template to describe what parts of the object to match. Thus, my mock-and-verify code looks like:
Tag: Java
Unit tests and system clock
It took me way too long to learn this. Your code (and its unit tests) should inject the system clock as a dependency.
For example, let’s say you have a service that writes a record to the database with the system clock.
public void save(String userName) {
    long currentTimeMs = System.currentTimeMillis();
    User user = User.builder()
        .name(userName)
        .updateTimeMs(currentTimeMs)
        .build();
    database.save(user);
}
How would you test this? You can inject a mock database instance and use it to verify that it got a User object. Great! You can verify the username is as expected. How do you verify that tricky business rule that updateTimeMs is the “current time”?
Tag: Engineering Culture
Why We Code Review
Code review can be an important part of a team’s culture, so it is worth thinking about. If you asked me about code review two years ago, I would’ve said it’s key to maintaining code quality and mentoring less experienced programmers. Now I know better. Code review is as much about making the code better as it is about making the team happier. How the team runs code reviews should be based on the team’s values and culture.