Tag: AI

December 31, 2024

Generative AI 2024 Retrospective

2024 witnessed a parade of increasingly powerful AI model releases, culminating in OpenAI’s groundbreaking “o3”. While I’m certain these advancements really are significant, I don’t think they have translated to noticeable improvements for the average user.

What people are paying attention to, though, are the really amazing product features that are coming out of Anthropic and Google. I’ve heard a lot about people building software with Claude’s Artifacts. And Google’s Deep Research product, released a few days ago, has just completely changed search.

August 20, 2024

Running Your Own LLM UI

This is a review of Open WebUI, an extensible and user-friendly self-hosted WebUI for LLMs.

I recently decided to run my own UI layer for LLMs, as I have some exciting ideas. For those who are not familiar with chatbots, their architecture basically looks like this:

While I’m pretty good at customizing the middleware, I’m not as skilled at customizing the UI layer.

I tried writing my own UI at first, but I quickly gave up when I realized it was beyond my skills as a frontend developer. The next thing I did was look into open-source solutions. There were a lot of choices, but I narrowed them down to these three:

January 2, 2023

OpenAI and ChatGPT

I had two weeks off at the end of 2022. Telesign closed its operations in the last week of 2022 to give employees well-deserved time off after a year of hard work, and I took another week off in addition to that. Like many technologists, I became captivated by OpenAI’s release of ChatGPT in 2022, and I spent a lot of the last two weeks exploring what OpenAI has to offer.

ChatGPT is a chatbot developed by OpenAI. It has been widely recognized for its impressive capabilities, such as imitating human writing, transforming plain English into code, and making glaringly stupid mistakes. The underlying technology of ChatGPT is a machine learning model known as GPT-3. This model is designed to predict how a human might continue a previous piece of text. For example, given what I’ve written so far, GPT-3 predicted what the rest of this blog post will look like:

Tag: Programming

December 7, 2023

My thoughts on AI in 2023

OpenAI took the world by storm in 2023 with the release of ChatGPT. Other companies quickly followed suit, introducing their own competitors to ChatGPT. In this update, I want to write down my thoughts on some of the biggest players in the field of generative AI.

When I use the term “generative AI,” I am referring to a computer program that can generate text, images, or other types of content. These programs can understand natural language and follow complex instructions. One notable example of a generative AI is OpenAI’s “GPT-4,” commonly known as ChatGPT. Another well-known generative AI is Midjourney, which specializes in generating images from text descriptions. In this update, I will focus on AI with text output, as that is the area I have the most experience with.

February 11, 2023

Use GPT-3 To Build A Code Translator

Once you know a programming language well, the process for learning a new language is not very hard. It is just time-consuming. You need to read the documentation for basic syntax and flow control, get familiar with its idioms, memorize core parts of the standard libraries, and learn its toolchain. What can we do to speed up the learning process? One thing we can do is provide great examples in the documentation. Can we do better? What if you had working examples for every problem you encountered? What if you could describe your intent in a familiar language and see how it should look in a new language? As it turns out, GPT-3 is really good at this task.

January 2, 2023

OpenAI and ChatGPT

May 3, 2021

Unit Tests vs Integration Tests

I used to favor automated integration testing. Now, I find myself walking away from it. I no longer find it worth the cost of setting up and maintaining such tests. I now rely mostly on unit testing.

First, I need to define what I mean by those terms. In the context of this post, unit tests

In Wikipedia

unit test	integration test
automated, not always	not specified
ranging from entire ‘module’ to an individual function	modules tested as a group
depends on execution conditions and testing procedures	depends on unit test/implies modules are already unit tested

From Martin Fowler

October 16, 2020

Dell XPS-13 - Developer Edition

This February, I ordered a Dell XPS-13 Developer Edition. The Developer Edition is a line of Dell laptops that ships with a certified Linux OS. It has been my programming powerhouse for the last eight months.

My first choice for a laptop was not Linux. Windows and macOS are simply more practical. Games and MS Office just work on Windows. Software support on macOS is generally good (except games), and it beats Windows with its underlying BSD architecture. Both have better hardware support simply because more people use them. However, I ultimately ended up with a Dell and Linux and it worked out great.

Pages

Vocabulary

bind: Sometimes a synonym for “map”
conflate: combine
disjoint: separate, e.g., odd numbers and even numbers are disjoint
disjunction: inclusive or – True if at least one of the inputs is True
extrinsic: not intrinsic
federated: Top-down delegation of responsibilities; Has a single point of failure at the top.
PID controller: control loop feedback mechanism; e.g., cruise control
99%ile: Abbreviation of percentile
EBNF: Extended Backus-Naur Form; Useful for defining the syntax of a programming language
EBNF terminal: a token/word/chunk in EBNF
Alpha: Αα
Beta: Ββ
Gamma: Γγ
Delta: Δδ; Commonly denotes ‘difference’
Epsilon: Εε; Used in the epsilon-greedy algorithm for multi-armed bandit problems; Error margin in floating point comparisons.
Zeta: Ζζ
Eta: Ηη
Theta: Θθ
Iota: Ιι
Kappa: Κκ
Lambda: Λλ
Mu: Μμ
Nu: Νν
Xi: Ξξ
Omicron: Οο
Pi: Ππ
Rho: Ρρ
Sigma: Σσς
Tau: Ττ
Upsilon: Υυ
Phi: Φφ
Chi: Χχ
Psi: Ψψ
Union: ∪
Aleph: ℵ; Symbol for cardinal numbers. ℵ₀ is pronounced Aleph-null.
Empty Set: ∅
Such That: Commonly represented as a colon (:) or a vertical bar (|); Example, D={x^2|x ∈ N, x >=1, x <= 4}. This reads D is the set of all x^2 SUCH THAT: 1) x is a natural number; 2) x is greater than or equal to 1; 3) x is less than or equal to 4.
Intersection: ∩
Subset: ⊂ or ⊆; e.g., if A = {1,4,9} and B = {1,4}, then B ⊂ A (B is a subset of A).
Belongs To: ∈; ∉ means “does not belong to”; To say that 1 belongs to S, we write 1 ∈ S.; e.g., if A = {1,4,9} and e = 4, then we say e∈A, meaning “e belongs to A”. However, one would not say e⊂A – e is a single element, not a set. Similarly, if B = {1,4}, one would not say B∈A or “B belongs to A”, as B is a set, not a single element.
Complements: Difference between two sets
Relative Complement: A\B means objects that belong to A and not to B. i.e., {1,2,3}\{3} == {1,2}
Omega: Ωω
P(A|B): The likelihood of event A occurring given that B is true.
P(A^C): The probability that A doesn’t happen
Precision Recall: Precision = probability that some retrieved doc is relevant; Recall = probability that some relevant doc was retrieved.
Narrow Integration Tests: exercise only that portion of the code in a service that talks to a separate service; uses test doubles of those services, either in process or remote; thus consist of many narrowly scoped tests, often no larger in scope than a unit test (and usually run with the same test framework that’s used for unit tests)
Broad Integration Tests: require live versions of all services, requiring substantial test environment and network access; exercise code paths through all services, not just code responsible for interactions
Balanced Binary Search Tree: For example, red-black tree or AVL tree.
Natural Numbers: ℕ; double-struck N; Cardinal numbers,
Complex Numbers: ℂ; double-struck C

https://en.wikipedia.org/wiki/Blackboard_bold

August 1, 2020

Go io/fs Design (Part I)

As usual, LWN has a good write-up on what’s going on in the Go community. This week’s discussion is on the new io/fs package. The Go team decided to use a Reddit thread to host the conversation about this draft design. LWN points out that posters raised the following concerns:

We added status logging by wrapping http.ResponseWriter, and now HTTP/2 push doesn’t work anymore, because our wrapper hides the Push method from the handlers downstream. / It becomes infeasible to use the decorator pattern more
Doing it “generically” involves a combinatorial explosion of optional interfaces

Ultimately, Russ Cox admits, “It’s true - there’s definitely a tension here between extensions and wrappers. I haven’t seen any perfect solutions for that.”

May 19, 2020

Unit tests and system clock

It took me way too long to learn this. Your code (and its unit tests) should inject the system clock as a dependency.

For example, let’s say you have a service that writes a record to the database with the system clock.

public void save(String userName) {
    // Reads the system clock directly, which is what makes this hard to test.
    long currentTimeMs = System.currentTimeMillis();
    User user = User.builder()
        .name(userName)
        .updateTimeMs(currentTimeMs)
        .build();
    database.save(user);
}

How would you test this? You can inject a mock database instance and use it to verify that it got a User object. Great! You can verify the username is as expected. How do you verify that tricky business rule that updateTimeMs is the “current time”?

May 17, 2020

Go Project Organization

Here’s a rough layout of how I organize my Go project. Some parts are situational and some parts are essential. I’ll go over both in this blog.

A rough layout:

+ basedir
   +-- go.mod (module jcheng.org)
   +-- hello (empty)
         +-- log/
         +-- utils/
         +-- config/
         +-- models/
         +-- repositories/
         +-- services/
         +-- cmd/
              +-- hello_app/
                     +--/cmd/
                          +-- speak/
                          +-- email/
                          +-- sms/

The basedir

Situational.

February 27, 2020

Dependencies

Some past version of me is saying that every class and function should be explicit about its dependencies, so that they are easily testable. John0 would say, “If you have a service that talks to a database, the database client should be an explicit dependency specified in the constructor. This makes the code easily testable.”

There is another version of myself from 10 minutes ago arguing it’s foolish to be explicit about everything. He’d point to this piece of code he’s just looked at:

Tag: Leadership

July 9, 2022

Book Review: Turn The Ship Around

My notes from Turn the Ship Around! A True Story of Turning Followers into Leaders.

Early in the book, Marquet tells a story of having failed to empower officers under his command. The story of being unable to “empower” stuck with me. It reminded me of experiences, earlier in my career, of failing to empower people reporting into me.
On empowerment, Marquet said “Empowerment is not how I want to be managed.” and “Empowering others feel manipulative. I believe people are empowered by nature.” This is often how I felt about empowerment.
- If not empowerment, then what? While he doesn’t mention this specifically, the way I interpreted his leadership style throughout the book is that he applied the Situational Leadership style.
Marquet tells a story of being reprimanded after coming up with a brilliant ruse to sink an enemy submarine (during practice). The moral of that story is that a great plan which is too complex for others to execute is, in fact, a bad plan.
- tldr; KISS
His story of being assigned to Santa Fe, of feeling dejected and finding the motivation to carry through is very inspiring. It is also a good example of how to be a good leader – his commander applied great leadership skills to help Marquet through it.
- On being told he’d have the full support of his command officer to succeed, then being told… “But, I don’t think it’s a good idea if you ask for A, B, and C” gave me a new perspective on “support.”
- One view of support is that “I trust you to do this” and “I think you are the best person for this job.”
- There are good ways (better than I have been doing) to communicate “these are the limitations to what we can provide to you.”
Watch out for signs of low morale! People avoiding mistakes, meeting the minimum requirements, and “do whatever they tell me to do.”
In chapter 13, Marquet tells a story of giving more responsibility to department heads too soon. In chapter 17, he talks about the importance of training in order for delegation to succeed. It feels like these two points should’ve been tied together more closely. Though I’m really just nitpicking now.

I suspect, if you are in the tech industry where there is already a lot of talk about autonomy and ownership, you likely already understand the theme of this book. If you believe that people need to be led or that a great personality is needed to inspire others, then you should check this out; it will offer you a useful counter perspective.

February 7, 2020

Delegation

One thing about job hopping is that you get to experience new perspectives on how people work. It forces you to reconsider what’s obvious. Take delegation for example. I used to think anyone with a bit of experience can do it effectively. It’s just about decomposing a system and assigning components to different people, right? It turns out effective delegation is much more nuanced.

Situational Leadership

Situational Leadership is a model that prescribes different delegation styles based on the competency (skill) and commitment (motivation) of each team member. Assuming motivation is not a factor, skill is the sole determinant of how much you should direct an engineer. A highly competent engineer requires very little direction. An inexperienced engineer requires specific and concrete directions.

Tag: Untagged

June 19, 2022

Book review - Bitch: On the Female of the Species

Bitch is a book from zoologist Lucy Cooke on correcting misunderstandings and cultural biases in our biological science. She explains, via animal studies, how existing notions of sex, male/female roles, and “what is natural” have been incomplete, if not incorrect.

In Bitch, Lucy goes through the behaviors of lemurs, meerkats, hyenas, moles, orcas, elephants, bonobos, termites, birds and fish to challenge conventional wisdom on nature and the roles of males and females. This alone makes for an engrossing read. I cannot help but be delighted by learning brand new things about nature and animals. It brings me back to being a child again. What makes things better is her amusing way with words. (I am jealous.) Some choice phrases from the book:

Pages

Pithy Sayings

Chaos theory: When the present determines the future, but the approximate present does not approximately determine the future. - Edward Lorenz
Code: If you can’t write it down in English, you can’t code it - Peter Halpern, via Jon Bentley
Prioritize: There is no such thing as two equally urgent projects, only priorities that haven’t been made clear yet. - Unknown
Regular expression: Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. - Jamie Zawinski
Science: When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together. - Isaac Asimov

May 17, 2022

Optimism

This is a story from Neil DeGrasse Tyson’s StarTalk podcast. He talked about being on Brian Cox’s show and discussing the future of space travel. Neil was saying that chemical rocket engines are not going to cut it and we need to look into things like wormholes and warp drives. Then Brian cut in and explained that wormholes are fundamentally unstable and they won’t work.

Brian was absolutely correct, but that was not the point. The point was about what the audience said. Cox’s audience said to them, “That’s why the Americans discover everything! They are always so optimistic!” (Cox is a British physicist and Neil is an American physicist).

September 14, 2021

Staff Plus Live 2021

Staff Plus Live 2021 was a virtual conference held by LeadDev.com on Sept 14, 2021. I believe this is their first conference aimed at Staff+ Engineers. Loosely defined, Staff+ are high-impact engineers whose success is felt across the company. In other words, these are engineers who create high leverage. I was able to attend this year’s conference and learned a lot from it. I started this post to write down what resonated with me before I forget it.

August 29, 2021

Post of the Week 2021-08-29

Just the act of measuring something can change outcomes. I love the reference to Jepsen.

https://danluu.com/why-benchmark/

July 25, 2021

Post of the Week 2021-07-25

Back from a long vacation. This week, we have folk wisdom on visual programming

https://drossbucket.com/2021/06/30/hacker-news-folk-wisdom-on-visual-programming/

June 3, 2021

Post of the Week 2021-06-03

Humans and incentives, and why some metrics (or objectives and key results) should not be public.

http://rachelbythebay.com/w/2021/06/01/count/

May 29, 2021

Post of the Week 2021-05-27

Systemd, the good parts

https://lobste.rs/s/po98o2/systemd_good_parts

May 16, 2021

Post of the Week 2021-05-13

My thoughts today led me to reading about nuclear waste. It turns out that what we call “nuclear waste” is a pretty vague term.

A light-water reactor generates different kinds of “waste” from a molten salt reactor. A traditional light-water reactor doesn’t consume fuel efficiently. It leaves behind partially consumed fuel that is not economically attractive to refine for further use. These partially spent rods continue to generate heat and are sometimes stored in cooling pools. Waste from light-water reactors contains plutonium, which can be used to create atomic weapons.

May 4, 2021

Post of the Week 2021-04-06

Grammar visualizer

https://dundalek.com/grammkit/

Which leads to a JS parser generator

https://pegjs.org/online

And also leads to a JS parser generator with a visual debugger

https://ohmlang.github.io/editor/

This also prompted my interest in a graphviz/dot parser

https://github.com/awalterschulze/gographviz

Which uses this Go parser generator

https://github.com/goccmack/gocc

This exploration also led me to this Go parser generator

https://github.com/alecthomas/participle

April 1, 2021

Posts Of The Week 2021-04-01

I spent a couple of hours evaluating 3rd party libraries. What have I learned? For me, there’s one clear winner in a small field of candidates.

Presently, these are the top hits for “golang gauge counter timer”.

The first result is go-kit. Go-kit isn’t a metrics library. Rather, it bills itself as a “framework for building microservices.” Its metrics package is simply a set of interfaces. You then reference one of the many sub-packages with concrete implementations. As a consequence, its go.mod file is pretty huge.

March 24, 2021

Posts Of The Week 2021-03-25

Go does not allow cyclic imports. A solution is to create a “shared” package to hold interfaces that related packages all reference. This, for some reason, reminds me of join tables in SQL.

Here is an example of a typical Go project. Packages toward the bottom, e.g., “common/persistence”, allow different packages to work with each other without introducing cyclic dependencies. For this project, “log” can be referenced by “config”, but cannot use “config” to configure itself.

March 8, 2021

Posts Of The Week 2021-03-11

Oldie but goodie. Go concurrency patterns

https://drive.google.com/file/d/1nPdvhB0PutEJzdCq5ms6UI58dp50fcAN/view

March 7, 2021

Posts Of The Week 2021-03-04

Two book recommendations

Designing Data Intensive Applications: Don’t let the name fool you. The knowledge in this book applies to more than just data processing applications but to distributed systems in general. It is a great book for all software architects.

Making Software: What Really Works, and Why We Believe It: A meta-analysis of various theories on software development processes. Is TDD effective? Is Agile just hype or is it just misused? Which code metrics are actually useful? There are evidence-based answers to many of these questions.

February 22, 2021

Posts Of The Week 2021-02-18

The Perseverance rover lands on Mars. This month, the UAE, PRC, and US all sent scientific instruments to Mars.

https://www.nytimes.com/2021/02/18/science/nasa-peseverance-mars-landing.html

January 31, 2021

Posts Of The Week 2021-01-29

TextRank identifies connections between various entities in a text, and implements the concept of recommendation. A text unit recommends other related text units, and the strength of the recommendation is recursively computed based on the importance of the units making the recommendation. In the process of identifying important sentences in a text, a sentence recommends another sentence that addresses similar concepts as being useful for the overall understanding of the text

January 22, 2021

Posts Of The Week 2021-01-22

This blog is such a great example of why it is difficult to create great software. So often you have to make impossible choices between security and backward-compatibility.

Today’s Go security release fixes an issue involving PATH lookups in untrusted directories that can lead to remote execution during the go get command. We expect people to have questions about what exactly this means and whether they might have issues in their own programs. This post details the bug, the fixes we have applied, how to decide whether your own programs are vulnerable to similar problems, and what you can do if they are.

January 6, 2021

Posts Of The Week 2021-01-08

Sacha Chua is a great resource on Emacs-y things

https://sachachua.com/blog/

January 5, 2021

Posts Of The Week 2020-12-31

Maybe Emacs doesn’t need to be a fusion reactor. I only hope it continues to generate energy for many years to come.
It just needs volunteers to keep the fire going.

https://www.murilopereira.com/how-to-open-a-file-in-emacs/

December 22, 2020

Posts Of The Week 2020-12-18

Replicating a database can make our applications faster and increase our tolerance to failures, but there are a lot of different options available and each one comes with a price tag. It’s hard to make the right choice if we do not understand how the tools we are using work, and what are the guarantees they provide (or, more importantly, do not provide), and that’s what I want to explore here.

December 13, 2020

Posts Of The Week 2020-12-11

Concept: Poison pill

One strategy is a poison pill: a special message on the queue that signals the consumer of that message to end its work. To shut down the squarer, since its input messages are merely integers, we would have to choose a magic poison integer (everyone knows the square of 0 is 0 right? no one will need to ask for the square of 0…) or use null (don’t use null). Instead, we might change the type of elements on the requests queue to an ADT…
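
A rough sketch of the poison pill idea, written in Go instead of the Java-style queues the quote has in mind. The request type with its stop flag plays the role of the ADT mentioned above; request, squarer, and squareAll are names invented for this example.

```go
package main

import "fmt"

// request is a hypothetical message type: either a number to square
// or a stop signal.
type request struct {
	value int
	stop  bool // true marks the poison pill
}

// squarer consumes requests until it sees the poison pill, then closes results.
func squarer(requests <-chan request, results chan<- int) {
	defer close(results)
	for req := range requests {
		if req.stop {
			return
		}
		results <- req.value * req.value
	}
}

// squareAll sends every value followed by a poison pill and collects the replies.
func squareAll(values []int) []int {
	requests := make(chan request)
	results := make(chan int)
	go squarer(requests, results)

	go func() {
		for _, v := range values {
			requests <- request{value: v}
		}
		requests <- request{stop: true}
	}()

	var out []int
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(squareAll([]int{0, 3, 4})) // [0 9 16]
}
```

Because the pill is a separate field, 0 stays a perfectly valid input. In idiomatic Go you would often just close the requests channel instead, but the explicit message mirrors the pattern described in the quote.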

November 24, 2020

Posts Of The Week 2020-11-27

Meet GPT-3. It Has Learned to Code (and Blog and Argue).

For many artificial intelligence researchers, it is an unexpected step toward machines that can understand the vagaries of human language — and perhaps even tackle other human skills.

November 12, 2020

Posts of the Week 11/06/20

https://rootsofprogress.org/immunization-from-inoculation-to-rna-vaccines

When you get your covid shot (probably in 2021), take a moment to think back on the 300 years of progress that got us to this point.

https://github.com/wbolster/emacs-python-black

This is an Emacs package to make it easy to reformat Python code using black, the uncompromising Python code formatter.

October 22, 2020

Posts of the Week 10/08/20

What are your favorite CLI apps?

I’m looking for CLI utilities that are definitely not part of the POSIX required or optional utilities, and more colloquially not considered to be standard BSD or *nix fare.

The Penderwicks

My daughters couldn’t stop laughing during storytime. They actually enjoy bedtime now. Birdsall is an excellent writer.

Just write the parser

This is a whirlwind tour of writing parsers by hand. Why would you want to do that, when tools like Yacc exist to do it for you?

October 20, 2020

2020 Emacs User Survey

I recently participated in the 2020 Emacs User Survey. One of the questions asked is “When you were first learning Emacs, what did you find difficult to learn?” The obvious answer is keyboard shortcuts, e.g., instead of CTRL-S for save, it is CTRL-X CTRL-S. Instead, CTRL-S performs find, which is usually mapped to CTRL-F, and so on and so forth.

Why Emacs is hard for new users

There were other problems too. I didn’t put them all down in the survey. I’ll jot them down here as they come to mind.

October 19, 2020

Webcam on Linux (Logitech Razer Kiyo)

I recently got a Logitech Razer Kiyo for my Zoom meetings. Currently, I need to use it on a Linux laptop, and it works rather well. I was skeptical that it would “just work” on Linux, but it did! There is also a software package named v4l-utils that allows you to configure the zoom (crop) level of your video, which is nice for cutting out undesirable background.

apt-get install v4l-utils

You can get a listing of attached cameras with

October 16, 2020

Posts of the Week 10/23/20

How to hire for your organization

A three-part series from 9/21/20 to 10/5/20 on how to hire for your organization.

The Temporal Workflow Framework

Let’s look at a use case. A customer signs up for an application with a trial period. After the period, if the customer has not cancelled, he should be charged once a month for the renewal. The customer has to be notified by email about the charges and should be able to cancel the subscription at any time.

October 16, 2020

Dell XPS-13 - Developer Edition

October 8, 2020

Posts of the Week 10/08/20

The Ultimate Guide to GPT3 from Twilio

A step-by-step walkthrough of what it is like to use the GUI to GPT3 from OpenAI

A Columnist Makes Sense of Wall Street Like None Other (See Footnote)

Mr. Levine wasn’t always a darling of business media and finance Twitter. (The best measure of his audience’s devotion may not be his 112,000 Twitter followers, but rather the 3,000 that follow @MattLevineBot, a fan account describing itself as a bot that mimics his writing style.) He began his post-collegiate career as a Latin teacher, then worked as a lawyer at Wachtell, Lipton, Rosen & Katz before advancing to Goldman. Despite having made more money at white-shoe law and Wall Street firms than he does as a writer, Mr. Levine says he is happier now. He is doing exactly what he has long wanted to do. This is the story of his ascension. It begins with an escalator.

September 21, 2020

Creating New Abstractions

Software engineers find it natural to talk about abstractions. We have ideas such as Decorators, Model-View-Controller, and Message Queues. These abstractions allow software engineers to talk to each other using our own rich and succinct language. Abstractions, however, are not unique to engineers. In my role as a parent and an American citizen, I am constantly confronted with new abstract ideas arising out of life.

The abstractions I am referring to are new words and ideas coming out of our shared culture. For example, words that did not exist twenty years ago: Me-Too and BLM as well as a redefinition of words like gender and socialism.

September 2, 2020

Post(s) of the Week Sept 2020

The Empathy Gap from Effectiviology.com

For example, if a person is currently feeling calm, the empathy gap can cause them to struggle to predict how they will act when they’re angry. Similarly, if a person who is on a diet is currently full, the empathy gap can cause them to struggle to assess how well they will be able to handle the temptation to eat when they’re hungry.

Tag: Go

October 17, 2021

Interpreter in Go - 4

In Chapter 1.3 of Ball’s Writing an Interpreter in Go, we encounter one design decision of his Monkey programming language. Here, the lexer has a NextToken() method that looks like this:

func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    switch l.ch {
// [...]
    default:
        if isLetter(l.ch) {
            tok.Literal = l.readIdentifier()
            return tok
        } else {
            tok = newToken(token.ILLEGAL, l.ch)
        }
    }
// [...]
}

This means the lexer itself does not do backtracking. The meaning of a character at any point cannot be ambiguous. You cannot say, for example, that ‘+’ is the ‘plus’ token unless it is in the middle of a variable name. I don’t know many programming languages that support such behavior – so it is probably an acceptable design decision. You know what they say, “Keep it Simple, Smartypants”. I’m not sure if there are other notable constraints introduced by the design at this point, but it is something that tickles my overly analytical brain.

August 22, 2021

Interpreter in Go - 3

A lexer takes the source code, a sequence of characters, and groups them into tokens. For example, it makes the first decision on how to process the strings 100-10, -100-10, and -100--100 into groups. I’m going to call this grouping “tokenization” even though I may be misusing the term.

Tokenizing source code is hard. How should -100--100 be tokenized? Should it be a literal -100 followed by the minus token, followed by another -100?
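
One possible answer, sketched below, is to keep the lexer dumb: always emit the minus sign as its own token and let the parser decide whether it is unary or binary. This is only an illustration of that one choice, not how the book's lexer is written, and tokenize is a made-up helper that handles just digits, spaces, and single-character operators.

```go
package main

import "fmt"

// tokenize splits input into integer literals and single-character operators.
// It never folds '-' into a number; deciding between unary and binary minus
// is left to the parser.
func tokenize(src string) []string {
	var tokens []string
	for i := 0; i < len(src); {
		ch := src[i]
		switch {
		case ch >= '0' && ch <= '9':
			// Consume a full run of digits as one literal.
			start := i
			for i < len(src) && src[i] >= '0' && src[i] <= '9' {
				i++
			}
			tokens = append(tokens, src[start:i])
		case ch == ' ':
			i++ // skip whitespace
		default:
			tokens = append(tokens, string(ch))
			i++
		}
	}
	return tokens
}

func main() {
	fmt.Println(tokenize("-100--100")) // [- 100 - - 100]
}
```

With this approach the lexer never needs to look backward to decide what a character means, at the cost of pushing the "is this a negative literal?" question to a later stage.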

July 29, 2021

Interpreter in Go - 2

Writing an Interpreter In Go by Thorsten Ball will be my personal introduction to writing an interpreter. I’ve never taken a comp sci class before, so I know nothing about compilers. On a lark, I decided to explore this area now, nearly 20 years after I started to learn computer programming.

If you are interested in this book as well, you might find the AST Explorer a useful companion.

I was told at some point in the past that compilation can be broken down into four stages:

July 27, 2021

Interpreter in Go - 1

It happened. At the recommendation of https://twitter.com/dgryski, I bought Writing an Interpreter In Go. This will be my next hobby project. It’ll be interesting to see if I ever finish it.

April 18, 2021

Posts Of The Week 2021-04-15

Lots of small things today.

History

“Those who cannot remember the past are condemned to repeat it”. I love reading about how software came to be the way it is today.

Before we can talk about where generics are going, we first have to talk about where they are, and how they got there.

https://cr.openjdk.java.net/~briangoetz/erasure.html

Go Errors

Error handling is still jacked in Go 1.16. That is, the formatting change is still not present. Why is this a problem? There are two use cases for errors. Errors as values, which can be inspected programmatically, and error printing, which is not meant for programmatic consumption.

April 3, 2021

Workflow Orchestration - Part 3 (How do I use this?)

In this part of the series, we’ll write some hands-on Temporal code and run it. Let’s start with our requirements:

You need to transmit a data packet. You can choose from multiple Route Providers to do this. Transmission takes time – you will be notified on a callback URL when the packet is delivered. Delivery may fail – either because the acknowledgement was not sent or arrived late (because Internet). You should try the next provider when one fails.

April 1, 2021

Posts Of The Week 2021-04-01

I spent a couple of hours evaluating 3rd party libraries. What have I learned? For me, there’s one clear winner in a small field of candidates.

Presently, these are the top hits for “golang gauge counter timer”.

March 24, 2021

Posts Of The Week 2021-03-25

Go does not allow cyclic imports. A solution is to create a “shared” package to hold interfaces that related packages all reference. This, for some reason, reminds me of join tables in SQL.

March 8, 2021

Posts Of The Week 2021-03-11

Oldie but goodie. Go concurrency patterns

https://drive.google.com/file/d/1nPdvhB0PutEJzdCq5ms6UI58dp50fcAN/view

November 23, 2020

Go: Pointer vs Value

In A Tour of Go, it states “Go has pointers. A pointer holds the memory address of a value.” When you design your data structure in Go, you have to decide between using a pointer or not. There’s no clear rule of thumb for it.
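To make the trade-off concrete, here is a minimal sketch of how the choice changes behavior. Counter and its methods are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for illustration.
type Counter struct {
	N int
}

// IncValue has a value receiver: it works on a copy, so the caller's Counter is unchanged.
func (c Counter) IncValue() { c.N++ }

// IncPointer has a pointer receiver: it mutates the caller's Counter.
func (c *Counter) IncPointer() { c.N++ }

// demo returns the counter after one call of each kind.
func demo() int {
	c := Counter{}
	c.IncValue()   // modifies a copy only
	c.IncPointer() // modifies c itself
	return c.N
}

func main() {
	fmt.Println(demo())

	// A pointer can also be nil, which lets it express "absent";
	// a plain value always has at least its zero value.
	var p *Counter
	fmt.Println(p == nil)
}
```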

I had been reading the Go source code to AWS’s client library for DynamoDB. For a while, I had been annoyed with their API design, which looks like this:

August 1, 2020

Go io/fs Design (Part I)

We added status logging by wrapping http.ResponseWriter, and now HTTP/2 push doesn’t work anymore, because our wrapper hides the Push method from the handlers downstream. / It becomes infeasible to use the decorator pattern more
Doing it “generically” involves a combinatorial explosion of optional interfaces

Ultimately, Russ Cox admits, “It’s true - there’s definitely a tension here between extensions and wrappers. I haven’t seen any perfect solutions for that.”
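The tension is easy to reproduce with the standard library. The sketch below uses http.Flusher as a stand-in for the Push method mentioned above, because httptest.ResponseRecorder implements Flush; statusWriter, wrap, canFlush, and newRecorder are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusWriter is a hypothetical logging wrapper. It embeds the
// http.ResponseWriter interface, so only Header, Write, and WriteHeader
// are promoted; any extra methods on the wrapped value are hidden.
type statusWriter struct {
	http.ResponseWriter
	status int
}

// WriteHeader records the status code before delegating.
func (w *statusWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// canFlush reports whether a downstream handler could use http.Flusher.
func canFlush(w http.ResponseWriter) bool {
	_, ok := w.(http.Flusher)
	return ok
}

// wrap returns the decorated writer as handlers would receive it.
func wrap(w http.ResponseWriter) http.ResponseWriter {
	return &statusWriter{ResponseWriter: w, status: http.StatusOK}
}

// newRecorder returns a writer whose concrete type implements http.Flusher.
func newRecorder() http.ResponseWriter { return httptest.NewRecorder() }

func main() {
	rec := newRecorder()
	fmt.Println(canFlush(rec))       // the underlying writer supports Flush
	fmt.Println(canFlush(wrap(rec))) // the wrapper hides it
}
```

Restoring the hidden method means adding a Flush method to the wrapper, and then a Push method, and so on for every optional interface and every combination of them, which is the combinatorial explosion described above.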

July 26, 2020

Localstack S3 and Go

I spent too much time Saturday getting the Go S3 SDK to work with LocalStack. It turns out that if you are using LocalStack, you need to explicitly configure the following properties:

	sess, err := session.NewSession(aws.NewConfig().
		WithEndpoint(endpoint).
		WithRegion(region).
		WithS3ForcePathStyle(true))

The requirements for Endpoint and Region are obvious. If S3ForcePathStyle is not specified, then LocalStack will fail.

	data, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: &bucket,
		Key:    &cfgKey,
	})

What is path-style? In May 2019, Amazon deprecated the path-based access model for S3 objects. This means one should no longer use URLs of the form:

May 26, 2020

Gomock Tutorial

I’ve been using mock/Gomock to write tests in my personal project. When you’re building something in a new language, it is hard to prioritize learning every tool in your toolchain. For me, I’ve been writing custom and suboptimal code for Gomock because of a nifty but undocumented API call .Do.

In many cases, I want to match subsets of a complex object while ignoring irrelevant parts, e.g., verify that a function is invoked with a list of User objects, but only check the email addresses. To do that in a generic way, I wrote a custom Matcher API that uses text/template to describe what parts of the object to match. Thus, my mock-and-verify code looks like:

May 17, 2020

Go Project Organization

Here’s a rough layout of how I organize my Go project. Some parts are situational and some parts are essential. I’ll go over both in this blog.

A rough layout:

+ basedir
   +-- go.mod (module jcheng.org)
   +-- hello (empty)
         +-- log/
         +-- utils/
         +-- config/
         +-- models/
         +-- repositories/
         +-- services/
         +-- cmd/
              +-- hello_app/
                     +--/cmd/
                          +-- speak/
                          +-- email/
                          +-- sms/

The basedir

Situational.

February 19, 2020

Mocking in Go Part 2

In a previous post, I talked about Gomock. I want to spend a bit more time on it.

As I’ve mentioned, setting up Gomock requires

Download mockgen

go get github.com/golang/mock/mockgen@latest

Edit your go.mod

module jcheng.org

go 1.13

require (
	github.com/golang/mock v1.4.0
)

Add the go:generate incantation to your code

//go:generate mockgen -source $GOFILE -destination mock_$GOFILE -package $GOPACKAGE
package pastself

import (
	"time"
)
...

Use mocks in your test case. In this example, I used a callback function to match on a nested property.

type fnmatcher struct {
	d    func(x interface{}) bool
	desc string
}

func (self *fnmatcher) Matches(x interface{}) bool {
	return self.d(x)
}

func (self *fnmatcher) String() string {
	return self.desc
}

func On(desc string, d func(x interface{}) bool) mock.Matcher {
	return &fnmatcher{d: d, desc: desc}
}

// TestSendOverdueMessages_ok is an example of matching arguments using a callback function
func TestSendOverdueMessages_ok(t *testing.T) {
	ctrl := mock.NewController(t)
	defer ctrl.Finish()

	mockUserRepo := NewMockUserRepository(ctrl)
	mockMessageRepo := NewMockMessageRepository(ctrl)
	mockMessageSender := NewMockMessageSender(ctrl)
	log, _ := NewBufferLog()
	r := NewPastSelfService(
		mockMessageRepo,
		mockMessageSender,
		mockUserRepo,
		log,
	)

	senderUserDDB_1 := &UserDDB{UserID: "sender1"}
	senderUserDDB_2 := &UserDDB{UserID: "sender2"}
	recipientUserDDB_2 := &UserDDB{
		UserID: "recipient2",
		Profile: UserProfileDDB{
			PreferredEmail: "recipient2@example.com",
		},
	}
	resultSet := &ResultSet{
		Items: []Message{
			{
				ID:         1,
				SenderID:   "sender1",
				Body:       "body1",
				Recipients: []string{"email://recipient1@example.com", "recipient2"},
				Headers: []Header{
					{Name: "x-header-key-1", Value: "value-1"},
					{Name: "x-header-key-2", Value: "value-2"},
				},
			},
			{
				ID:       2,
				SenderID: "sender2",
				Body:     "body2",
			},
		},
	}
	mockUserRepo.EXPECT().GetUser("sender1").Return(senderUserDDB_1, nil)
	mockUserRepo.EXPECT().GetUser("sender2").Return(senderUserDDB_2, nil)
	mockUserRepo.EXPECT().GetUser("recipient2").Return(recipientUserDDB_2, nil)
	mockMessageRepo.EXPECT().FindOverdue(mock.Any(), 0, "").Return(resultSet, nil)

	matchFrom := On("user.UserID==sender1", func(x interface{}) bool {
		if y, ok := x.(*UserDDB); ok {
			return y.UserID == "sender1"
		}
		return false
	})
	matchTo := On("[]Recipient{recipient1@example.com,recipient2@example.com}", func(x interface{}) bool {
		if y, ok := x.([]Recipient); ok {
			return len(y) == 2 &&
				y[0].Email == "recipient1@example.com" &&
				y[1].Email == "recipient2@example.com"
		}
		return false
	})
	matchOutMessage := On("OutboundMessage.Body==body1", func(x interface{}) bool {
		if y, ok := x.(OutboundMessage); ok {
			return y.Body == "body1"
		}
		return false
	})
	mockMessageSender.EXPECT().Send(matchFrom, matchTo, matchOutMessage).Return(nil).MaxTimes(1)

	r.SendOverdueMessages()
}

One thing nice about GoMock is that it generates statically typed method names for the EXPECT() statements, so the compiler can check that you’re using the correct method names. Neat.

February 19, 2020

Few Annoying Things about Go

As promised, a few thoughts about the things in Go that are annoying.

No Generics

Code generation is tightly coupled with the tool chain. When I need to use code generation, e.g., for enums and mocks, the use case could be solved with generics instead.

No lambda syntax

Scala has a lambda syntax to make it easy to work with anonymous function objects

val numbers = Seq(1, 5, 2, 100)
val doubled = numbers.map(
    n => n * 2
  )

Java has a similar syntax

February 14, 2020

Mocking in Go

Today I want to talk a little about the testify and gomock packages and doing mocks in Go.

Mocking is essential when writing code that depends on external services, e.g., microservices, message brokers, and datastores. For many people, this means web applications: My system has business rules and talks to multiple services. How do I test the business rules without setting up all the services locally?

Some people argue that mocks create a false sense of test coverage and introduce blind spots. I think that’s partially true but also too simplistic. In many cases, engineers gain sufficient value from testing “glue code” to make mocking worthwhile. This follows the principle of don’t let perfect be the enemy of good.

February 8, 2020

Why I like Go

I have a few side projects, published and unpublished. By trade, I’ve come to programming as a Java programmer. I started coding directly against the Servlet API and have coded using Enterprise Java Beans, Play Framework, Spring Framework, and even some Scala. Lately, I’ve been coding mainly in Go with some Python.

Go is a really nice programming language. Some people like it for its simplicity. I want to offer a different take on why you should learn and use Go for your own projects.

January 4, 2020

Viper Examples

Viper is a configuration library for Go. It has been my go-to library for configs. Some of the best features of Viper are:

Ability to unmarshal subsets of a configuration file into a Go struct
Ability to override contents of a configuration file using OS environment variables

Here’s a brief example that demonstrates both features:

Assume you have a configuration file at $HOME/server.toml

resolve_dns = true
http_keep_alive = 5

[example_com]
doc_root = "/var/www/example_com"
allow_files = "*.html"
login_required = true
login_lockout_count = 3

[example_org]
doc_root = "/var/www/example_org"
allow_files = "*.html, *.jpg"
login_required = false
login_lockout_count = 5

Then you can utilize the configuration file using this snippet:

November 29, 2019

Go1 13 Errors

For me, Go 1.13 arrived with anticipation of better error handling. Presently, the second Google search result for Go 1.13 error handling is an article that refers to the “xerrors” package. One feature of the xerrors package is that it produced errors with a stack trace showing where the error came from. The addition to Go 1.13, however, did not include this particular feature, and left me spending frustrating hours trying to debug the loss of stack trace after switching from xerrors to the standard library.

August 19, 2019

Own Your Data

I previously wrote about owning my own data. An important part of data ownership is backing up your data. I use S3 as my long term data store. It is pretty easy to set this up using Terraform.

S3

Provisioning an S3 bucket is simply a single Terraform resource:

resource "aws_s3_bucket" "repo_archive_log" {
  acl = "log-delivery-write"
  bucket = "example-bucket"
  tags = {
    Name = "example"
    TTL = "persistent"
    ManagedBy = "Terraform"
  }
}

Tag: InterpreterInGo

October 17, 2021

Interpreter in Go - 4

In Chapter 1.3 of Ball’s Writing an Interpreter in Go, we encounter one design decision of his Monkey programming language. Here, the lexer has a NextToken() method that looks like this:

func (l *Lexer) NextToken() token.Token {
    var tok token.Token

    switch l.ch {
// [...]
    default:
        if isLetter(l.ch) {
            tok.Literal = l.readIdentifier()
            return tok
        } else {
            tok = newToken(token.ILLEGAL, l.ch)
        }
    }
// [...]
}

August 22, 2021

Interpreter in Go - 3

Tokenizing source code is hard. How should -100--100 be tokenized? Should it be a literal -100 followed by the minus token, followed by another -100?
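One common answer is to keep the lexer simple: it never produces negative literals, every minus sign becomes its own token, and the parser decides later whether a given minus is a prefix or an infix operator. Here is a minimal sketch of that approach; tokenize is a hypothetical helper, not code from the book.

```go
package main

import "fmt"

// tokenize is a deliberately tiny lexer for digits and '-'. It never produces
// negative literals: every '-' becomes its own token and the parser is left
// to decide whether it is a prefix or an infix operator.
func tokenize(src string) []string {
	var toks []string
	for i := 0; i < len(src); {
		switch c := src[i]; {
		case c == '-':
			toks = append(toks, "-")
			i++
		case c >= '0' && c <= '9':
			// Consume the whole run of digits as a single integer token.
			start := i
			for i < len(src) && src[i] >= '0' && src[i] <= '9' {
				i++
			}
			toks = append(toks, src[start:i])
		default:
			i++ // skip anything else (whitespace, unsupported characters)
		}
	}
	return toks
}

func main() {
	fmt.Println(tokenize("-100--100"))
}
```

With this design, -100--100 comes out as five tokens: a minus, 100, a minus, a minus, and 100.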

July 29, 2021

Interpreter in Go - 2

If you are interested in this book as well, you might find the AST Explorer a useful companion.

I was told at some point in the past that compilation can be broken down into four stages:

July 27, 2021

Interpreter in Go - 1

It happened. At the recommendation of https://twitter.com/dgryski, I bought Writing an Interpreter In Go. This will be my next hobby project. It’ll be interesting to see if I ever finish it.

Tag: Tags

April 18, 2021

Posts Of The Week 2021-04-15

Lots of small things today.

History

“Those who cannot remember the past are condemned to repeat it”. I love reading about how software came to be the way it is today.

Before we can talk about where generics are going, we first have to talk about where they are, and how they got there.

https://cr.openjdk.java.net/~briangoetz/erasure.html

Go Errors

Tag: Orchestration

April 3, 2021

Workflow Orchestration - Part 3 (How do I use this?)

In this part of the series, we’ll write some hands-on Temporal code and run it. Let’s start with our requirements:

You need to transmit a data packet. You can choose from multiple Route Providers to do this. Transmission takes time – you will be notified on a callback URL when the packet is delivered. Delivery may fail – either because the acknowledgement was not sent or arrived late (because Internet). You should try the next provider when one fails.

March 7, 2021

Workflow Orchestration - Part 2 (Why do I care?)

An increasingly distributed and fragile world

Workflow platforms are important because software engineers are increasingly adopting distributed systems in their architecture. There are two reasons for this change: 1) Users are demanding more frequent releases, feature teams, better performance, and higher availability; 2) Providers are increasingly moving away from “use our library” (Spring Framework) to “use our APIs” (AWS, Azure, and GCP).

This change is undoubtedly a good thing. However, it also introduces new problems. It is much harder to trace program execution in a distributed system. A business process can span multiple services, created by multiple teams, in a variety of programming languages. There are more ways for things to fail, less consistency in code quality and documentation, and it’s harder to understand what happens when things go wrong.

February 28, 2021

Workflow Orchestration 1 - What is a workflow?

Introduction

It is hard to describe what a Workflow Platform is. It is both familiar and exotic. There are aspects of the problem space we all know well: Retries, eventual consistency, message processing semantics, visibility, heartbeating, and distributed processing to name a few. Yet, when they’re all put together in a pretty package with a bow tied on top, it becomes something almost magical. It feels like seeing the iPhone for the first time: Of course you want a touch screen on a cellphone and mobile internet access. Similarly, a workflow platform feels like the only natural way to solve problems. Once you learn it, anything else feels as clunky as using a feature phone.

Tag: Python

November 12, 2020

Posts of the Week 11/06/20

https://rootsofprogress.org/immunization-from-inoculation-to-rna-vaccines

When you get your covid shot (probably in 2021), take a moment to think back on the 300 years of progress that got us to this point.

https://github.com/wbolster/emacs-python-black

This is an Emacs package to make it easy to reformat Python code using black, the uncompromising Python code formatter.

May 19, 2020

Unit tests and system clock

It took me way too long to learn this. Your code (and its unit tests) should inject the system clock as a dependency.

For example, let’s say you have a service that writes a record to the database with the system clock.

public void save(String userName) {
	// Reads the system clock directly, so a test cannot control the timestamp.
	long currentTimeMs = System.currentTimeMillis();
	User user = User.builder()
		.name(userName)
		.updateTimeMs(currentTimeMs)
		.build();
	database.save(user);
}
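The same idea in Go, as a minimal sketch: the service receives a function that returns the current time instead of calling the system clock itself. Clock, UserService, fixedClock, and the in-memory slice standing in for the database are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Clock is a hypothetical abstraction over "what time is it now".
type Clock func() time.Time

// User mirrors the record in the Java snippet above.
type User struct {
	Name         string
	UpdateTimeMs int64
}

// UserService receives its clock instead of reading the system clock directly.
type UserService struct {
	now   Clock
	saved []User // stands in for the database
}

func NewUserService(now Clock) *UserService { return &UserService{now: now} }

// Save stamps the record with whatever the injected clock reports.
func (s *UserService) Save(name string) User {
	u := User{Name: name, UpdateTimeMs: s.now().UnixMilli()}
	s.saved = append(s.saved, u)
	return u
}

// fixedClock returns a Clock frozen at the given Unix time in milliseconds.
func fixedClock(ms int64) Clock {
	return func() time.Time { return time.UnixMilli(ms) }
}

func main() {
	prod := NewUserService(time.Now) // production wiring uses the real clock
	fmt.Println(prod.Save("alice").UpdateTimeMs > 0)

	test := NewUserService(fixedClock(1589846400000)) // tests use a frozen clock
	fmt.Println(test.Save("bob").UpdateTimeMs)
}
```

A test can now assert on the exact timestamp, because the only source of time is the value it passed in.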

Tag: Docker

October 30, 2020

Docker With Custom DNS Server

By configuring your Docker containers to talk to a custom DNS server, you gain more control over how your containers look up other services (databases, other microservices, etc.).

It turns out Docker networking isn’t completely straightforward. On Linux you can:

Run a DNS server locally, either dnsmasq or devdns
Run your container with --dns 172.17.0.1, the magic IP where your host machine is located
Configure dnsmasq from step 1 via /etc/hosts

Sadly, none of it was easy. The entire thing probably took me three hours.

July 12, 2020

Running Private Docker Registry

I still find it hard to believe how easy it is to run your own infrastructure in the cloud.

Running a Docker registry is as simple as adding a few lines of code to your Terraform configuration.

	resource "aws_ecr_repository" "foo" {
	  name                 = "bar"
	  image_tag_mutability = "MUTABLE"

	  image_scanning_configuration {
		scan_on_push = true
	  }
	}

When deployed, this creates a registry where you can manage multiple Docker repositories. You can upload a Docker image to be used in your private ECS cluster:

Tag: AWS

October 15, 2020

AWS Step Functions

I briefly worked on a workflow system at $past_job, implemented using AWS Step Functions. My experience was pretty terrible. I wasn’t sure which technical requirements led the team to this system. Some people said we needed a “system that is configuration driven and not code driven” and some people said “we needed something that scales.” Whatever the reason was, making improvements to this system was a pain in the ass, with AWS Step Functions itself being somewhat responsible.

September 8, 2020

I want to love DynamoDB

I want to love DynamoDB. I love that it just scales (disk usage and processing power). I love that it is tightly integrated with AWS’s IAM model, so I don’t have to deal with user/role/permissions management.

But DynamoDB does some weird things by design. For example, only the primary key can be unique. If you want a table with multiple unique attributes, for example, a Users table where both the user_name and email are unique, you’ll have to do weird things like this.

July 26, 2020

Localstack S3 and Go

I spent too much time Saturday getting the Go S3 SDK to work with LocalStack. It turns out that if you are using LocalStack, you need to explicitly configure the following properties:

	sess, err := session.NewSession(aws.NewConfig().
		WithEndpoint(endpoint).
		WithRegion(region).
		WithS3ForcePathStyle(true))

The requirements for Endpoint and Region are obvious. If S3ForcePathStyle is not specified, then LocalStack will fail.

	data, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: &bucket,
		Key:    &cfgKey,
	})

What is path-style? In May 2019, Amazon deprecated the path-based access model for S3 objects. This means one should no longer use URLs of the form:

August 19, 2019

Own Your Data

I previously wrote about owning my own data. An important part of data ownership is backing up your data. I use S3 as my long term data store. It is pretty easy to set this up using Terraform.

S3

Provisioning an S3 bucket is simply a single Terraform resource:

resource "aws_s3_bucket" "repo_archive_log" {
  acl = "log-delivery-write"
  bucket = "example-bucket"
  tags = {
    Name = "example"
    TTL = "persistent"
    ManagedBy = "Terraform"
  }
}

Tag: Math

September 21, 2020

Deriving Average Without Totals

Even an old dog like me learns something new every day.

Let’s say you need a program that receives a sequence of numbers and outputs the current average. How do you do this efficiently? The naive solution is to keep track of two variables:

total   # add up all numbers seen so far
count   # a count of how many numbers seen so far

This way, at any step, you can calculate the current average using total/count.
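As a sketch, here is the naive version next to a well-known alternative that keeps only the current mean and the count, using the update mean_n = mean_(n-1) + (x_n - mean_(n-1)) / n. The incremental form is offered as one standard way to avoid the running total, not necessarily the derivation this post has in mind; both type names are hypothetical.

```go
package main

import "fmt"

// NaiveAverage keeps the two variables described above.
type NaiveAverage struct {
	total float64
	count int
}

// Add records x and returns total/count.
func (a *NaiveAverage) Add(x float64) float64 {
	a.total += x
	a.count++
	return a.total / float64(a.count)
}

// RunningAverage keeps only the current mean and the count.
type RunningAverage struct {
	mean  float64
	count int
}

// Add records x and nudges the mean toward it by 1/count of the difference.
func (a *RunningAverage) Add(x float64) float64 {
	a.count++
	a.mean += (x - a.mean) / float64(a.count)
	return a.mean
}

func main() {
	var naive NaiveAverage
	var running RunningAverage
	for _, x := range []float64{2, 4, 6, 8} {
		fmt.Println(naive.Add(x), running.Add(x))
	}
}
```

The incremental form never stores a sum that grows without bound, which matters when the stream is long or the values are large.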

Pages

Vocabulary

bind: Sometimes a synonym for “map”
conflate: combine
disjoint: separate, e.g., odd numbers and even numbers are disjoint
disjunction: inclusive or – True if at least one of the inputs is True
extrinsic: not intrinsic
federated: Top-down delegation of responsibilities; Has a single point of failure at the top.
PID controller: control loop feedback mechanism; e.g., cruise control
99%ile: Abbreviation of 99th percentile
EBNF: Extended Backus-Naur Form; Useful for defining the syntax of a programming language
EBNF terminal: a token/word/chunk in EBNF
Alpha: Αα
Beta: Ββ
Gamma: Γγ
Delta: Δδ; Commonly denotes ‘difference’
Epsilon: Εε; Used in the epsilon-greedy algorithm for multi-armed bandit problems; Error margin in floating point comparisons.
Zeta: Ζζ
Eta: Ηη
Theta: Θθ
Iota: Ιι
Kappa: Κκ
Lambda: Λλ
Mu: Μμ
Nu: Νν
Xi: Ξξ
Omicron: Οο
Pi: Ππ
Rho: Ρρ
Sigma: Σσς
Tau: Ττ
Upsilon: Υυ
Phi: Φφ
Chi: Χχ
Psi: Ψψ
Union: ∪
Aleph: ℵ; Symbol for infinite cardinal numbers. ℵ₀ is pronounced Aleph-null.
Empty Set: ∅
Such That: Commonly represented as a colon (:) or a vertical bar (|); e.g., D = {x^2 | x ∈ N, x >= 1, x <= 4}. This reads: D is the set of all x^2 SUCH THAT 1) x is a natural number; 2) x is greater than or equal to 1; 3) x is less than or equal to 4.
Intersection: ∩
Subset: ⊂ or ⊆; e.g., if A = {1,4,9} and B = {1,4}, then B ⊂ A (B is a subset of A).
Belongs To: ∈; ∉ Means “not belong”.; To say that 1 belongs to S, we write 1 ∈ S.; e.g., if A = {1,4,9} and e = 4, then we say e∈A, meaning “e belongs to A”. However, one would not say e⊂A – e is a single element, not a set. Similarly, if B = {1,4}, one would not say B∈A or “B belongs to A”, as B is a set not a single element.
Complements: Difference between two sets
Relative Complement: A\B means objects that belong to A and not to B, e.g., {1,2,3}\{3} == {1,2}
Omega: Ωω
P(A|B): The likelihood of event A occurring given that B is true.
P(A^C): The probability that A doesn’t happen
Precision Recall: Precision = probability that some retrieved doc is relevant; Recall = probability that some relevant doc was retrieved.
Narrow Integration Tests: exercise only that portion of the code in a service that talks to a separate service; uses test doubles of those services, either in process or remote; thus consist of many narrowly scoped tests, often no larger in scope than a unit test (and usually run with the same test framework that’s used for unit tests)
Broad Integration Tests: require live versions of all services, requiring substantial test environment and network access; exercise code paths through all services, not just code responsible for interactions
Balanced Binary Search Tree: For example, red-black tree or AVL tree.
Natural Numbers: ℕ; double-struck N; Cardinal numbers
Complex Numbers: ℂ; double-struck C

https://en.wikipedia.org/wiki/Blackboard_bold
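The set entries above (∈, ∪, ∩, ⊆, and A\B) can be tried out with a small map-based set. This is only an illustrative sketch; Set and its methods are hypothetical names, not a standard library type.

```go
package main

import "fmt"

// Set is a minimal integer set built on a map; the names are illustrative.
type Set map[int]struct{}

// NewSet builds a set from the given elements.
func NewSet(xs ...int) Set {
	s := Set{}
	for _, x := range xs {
		s[x] = struct{}{}
	}
	return s
}

// Contains reports x ∈ s.
func (s Set) Contains(x int) bool {
	_, ok := s[x]
	return ok
}

// Union returns s ∪ t.
func (s Set) Union(t Set) Set {
	out := Set{}
	for x := range s {
		out[x] = struct{}{}
	}
	for x := range t {
		out[x] = struct{}{}
	}
	return out
}

// Intersect returns s ∩ t.
func (s Set) Intersect(t Set) Set {
	out := Set{}
	for x := range s {
		if t.Contains(x) {
			out[x] = struct{}{}
		}
	}
	return out
}

// Minus returns the relative complement s\t.
func (s Set) Minus(t Set) Set {
	out := Set{}
	for x := range s {
		if !t.Contains(x) {
			out[x] = struct{}{}
		}
	}
	return out
}

// SubsetOf reports s ⊆ t.
func (s Set) SubsetOf(t Set) bool {
	for x := range s {
		if !t.Contains(x) {
			return false
		}
	}
	return true
}

// Equal reports whether s and t hold exactly the same elements.
func (s Set) Equal(t Set) bool { return len(s) == len(t) && s.SubsetOf(t) }

func main() {
	a, b := NewSet(1, 4, 9), NewSet(1, 4)
	fmt.Println(b.SubsetOf(a))                                  // B ⊆ A
	fmt.Println(a.Contains(4))                                  // 4 ∈ A
	fmt.Println(NewSet(1, 2, 3).Minus(NewSet(3)).Equal(NewSet(1, 2))) // {1,2,3}\{3} == {1,2}
	fmt.Println(len(NewSet(1, 3).Intersect(NewSet(2, 4))) == 0) // odd and even are disjoint
}
```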

Tag: Books

August 17, 2020

Managing Humans

I finished Managing Humans by Michael Lopp today. I knew of Lopp through “Rands Leadership”, the Slack community he created. I enjoyed learning from the people in the community and thought I’d pick up his book. I’m not sure what I expected going into the book. I think I expected tips and rules that I can follow. After trying to summarize the book, I think what I wanted and got was a set of mental models that I am happy to add to my toolbox.

Tag: Management

August 17, 2020

Managing Humans

Tag: TDD

May 26, 2020

Gomock Tutorial

Tag: Java

May 19, 2020

Unit tests and system clock

It took me way too long to learn this. Your code (and its unit tests) should inject the system clock as a dependency.

For example, let’s say you have a service that writes a record to the database with the system clock.

public void save(String userName) {
	// Reads the system clock directly, so a test cannot control the timestamp.
	long currentTimeMs = System.currentTimeMillis();
	User user = User.builder()
		.name(userName)
		.updateTimeMs(currentTimeMs)
		.build();
	database.save(user);
}

Tag: Engineering Culture

September 3, 2019

Why We Code Review

Code review can be an important part of a team’s culture, so it is worth thinking about. If you asked me about code review two years ago, I would’ve said it’s key to maintaining code quality and mentoring less experienced programmers. Now I know better. Code review is as much about making the team happier as it is about making the code better. How the team runs code reviews should be based on the team’s values and culture.
