

Hugo Images Processing
2021.08.11

Hugo, the static website generator, is a fantastic tool, as I’ve told you before. Since I switched to it, I’m confident that the site is fast and responsive.

However, my site is packed full of images. Some are personal. Some are huge. Some are PNGs and some are JPGs. I created a gallery component just to handle posts that I want to fill with dozens of them.

Managing post images is a tedious task. For every post, I have to check:

  • Dimension
  • Compression
  • EXIF metadata
  • Naming

Dimension

[Image: Hugo images processing size 2]

Serving an image bigger than the screen is useless: it’s a larger file to download, consuming bandwidth from both the user and the server. Google Lighthouse and other site-metric evaluators all recommend resizing images to at most the screen size.

In Hugo, I’ve automated this using its built-in image processing functions:

{{ $image_new := ($image.Resize (printf "%dx" $width)) }}
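One refinement worth adding is a guard against upscaling, since an image already smaller than the target should be left alone. A minimal sketch of the idea, assuming a target width of 850px (the check and the example value are my additions, not this site’s actual template):

{{/* Minimal sketch: downscale only, never upscale.
     The 850px target width is an assumed example value. */}}
{{ $width := 850 }}
{{ if gt $image.Width $width }}
  {{ $image = $image.Resize (printf "%dx" $width) }}
{{ end }}
<img src="{{ $image.RelPermalink }}"
     width="{{ $image.Width }}" height="{{ $image.Height }}">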

Compression

[Image: Lossy compression comparison]

My personal photos are, most of the time, taken as JPEGs. Recently I changed my phone camera’s default format to HEIC, which provides better compression for high-resolution photos. The web, however, does not support such a format.

Some of the pictures used to illustrate posts are PNGs. They have better quality at the expense of being larger. Mostly, only illustrations and images with text are worth keeping in a lossless format.

Whatever the format, I would like to compress as much as possible to waste less bandwidth. I’m currently inclined to use WebP, because it can shrink the final size by a considerable amount.

{{ $image_new := ($image.Resize (printf "%dx webp" $width)) }}
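Converting the target format inside Resize, as above, requires Hugo 0.83 or newer. And since WebP support was not yet universal among browsers, a common pattern is a <picture> element that offers the WebP version first and falls back to the plain resize. A minimal sketch reusing the $image and $width variables from above (illustrative markup, not this site’s actual template):

{{/* Minimal sketch: WebP with a fallback for browsers that lack support. */}}
{{ $fallback := $image.Resize (printf "%dx" $width) }}
{{ $webp := $image.Resize (printf "%dx webp" $width) }}
<picture>
  <source srcset="{{ $webp.RelPermalink }}" type="image/webp">
  <img src="{{ $fallback.RelPermalink }}" alt="{{ $image.Title }}">
</picture>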

EXIF metadata

Each digital image has a lot, and I mean A LOT, of metadata embedded inside the file: the day and time it was taken, the camera type, the phone model; even longitude and latitude might be included by the camera app. It all reveals personal information that was supposed to stay hidden.

In order to share them on the open public internet, it is important to sanitize all images, stripping out all this information. Hugo does not carry this metadata along when it generates new images. So, as long as every image gets at least a minimal resize, this matter is handled by default.
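That said, Hugo can still read the EXIF data from the original before the processed copy drops it, which is handy if you want to keep something like the capture date for a caption. A minimal sketch using Hugo’s .Exif accessor (the caption usage is my own assumption, not from this site):

{{/* Minimal sketch: read EXIF from the original; resized copies won't have it. */}}
{{ with $image.Exif }}
  <figcaption>Taken on {{ .Date.Format "2006-01-02" }}</figcaption>
{{ end }}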

Naming

I would like to have a well-organized image library, and it would be nice to standardize the file names. Using the post title to rename all its images would be great, even more so if combined with a caption or user-provided description.

However, Hugo does not allow renaming them. To make matters even worse, it appends a hash code to each file name. A simple picture.jpeg suddenly becomes picture-hue44e96c7fa2d94b6016ae73992e56fa6-80532-850x0-resize-q75-h2_box.webp.

An incomprehensible mess. If you know a better way, let me know.

So What?

So, if most of the routine can be automated, what’s the problem?

The main issue is that Hugo has to pre-process ALL images upfront. As mentioned in the previous post, this can take a considerable amount of time, especially when converting to a computationally demanding format such as WebP.

Netlify is constantly hitting the time limit to build the site, all because of the thousands of image compressions. I am planning to revert the commits where I implemented WebP and reapply them little by little, allowing Netlify to build a version and cache the results.

There are some categories of images:

  • Gallery full-size images: there are hundreds of them and they would take most of the processing time, but I can extract metadata from the originals. The advantage is that they are rarely clicked and served.
  • Gallery thumbnails: the images actually shown in gallery mode. They account for the biggest chunk of the main page’s overall size whenever a gallery is among the top 10 latest posts.
  • Post images: images that illustrate each article. They are resized to fit the whole page, so compressing them represents a nice saving.
  • Post top banners: some posts have a top image. It is cropped to a banner-like size, so these are generally not that big.

I will, in the next couple of hours, try to implement the WebP code for each of these groups. If successfully completed, it will save hundreds of megabytes in the build.

Bonus Tip

Hugo copies all resources (images, PDFs, audio, text, etc.) from the content folder to the final public/ build, even if you only use the resized versions. Not only does the build become larger, but the original images whose metadata you wanted to hide are still online, even if not directly referenced in the HTML.

A tip for those working with Hugo and a lot of processed images: use the following code in the content front matter to instruct Hugo not to include these unused resources in the final build.

cascade:
  _build:
    publishResources: false
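Since cascade propagates front matter to all descendant pages, placing this in the top-level content/_index.md should apply it site-wide; adjust the placement to your own section layout.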

Let’s build.

Edit on 2021-08-25

I discovered that Netlify has a plugin ecosystem, and one of the available plugins is a Hugo caching system. It should drastically speed up build times, and also make it possible to convert all images to WebP once and for all. I will test this feature right now and post the results later.

Edit on 2021-09-13

The plugin worked! I had to set it up using file configuration instead of the easy one-click button. Build time went from 25 minutes to just 2. The current cache size is about 3.7 GB, which makes that totally understandable.

It will allow me to push much more frequent updates. Ok, to be frank: it was never the build time that restricted my posting frequency. Patience, inspiration, and focus are still the main constraints on blogging.

In the netlify.toml file at the root, I added:

# Hugo cache resources plugin
# https://github.com/cdeleeuwe/netlify-plugin-hugo-cache-resources#readme
[[plugins]]
package = "netlify-plugin-hugo-cache-resources"

[plugins.inputs]
# If it should show more verbose logs (optional, default = true)
debug = true

Project Curva
2021.03.14

For the last 9 years, I have worked as a planner and controller at a multinational Brazilian oil company. The team consolidates all the planning information of the whole company, analyzes it, and reports to the company’s board of directors.

For all these years, I’ve struggled to deal with some basic business scenarios:

  • At the very end of the process, someone in the chain of information submits a last-minute update that cannot be ignored
  • The board decides to change the plan
  • The existence of multiple simultaneous plans, for optimistic and pessimistic scenarios
  • Changes in the organizational structure

The information systems currently used or developed by the company are too restrictive to accommodate these business cases. The general solution is to build entire systems out of dozens of spreadsheets: a patchwork of data, susceptible to data loss and with zero control.

To address this, I decided to develop, on my own, a new system that is both flexible and powerful. The core propositions are:

  • Versioning: instead of overwriting data whenever there is a change request, the system should preserve the existing data and generate another version. Both should remain accessible, in order to allow comparison and auditing.
  • Branching: beyond sequential versioning (v1, v2, v3), it should allow users to keep multiple current versions. Creating scenarios or even temporary exercises should be effortless.
  • Multiple dimensions: for each unit (e.g., a project in a list of projects), the user can insert future CAPEX, OPEX, production, average cost, number of workers, or any arbitrary dimension. It’s all about capturing future series of values, regardless of their meaning.
  • Multiple teams: within the same organization, users can create inner teams that deal with different aspects of the business. The system should allow users to set the list of units to control (projects, employees, buildings, or whatever) and their dimensions of measurement, and then control user access to all this information. It’s a decentralized way to create plans.
  • Spreadsheets as first-class citizens: small companies might not use them much, but any mid-to-big company uses spreadsheets for everything. Importing and exporting system data as Excel/LibreOffice/Google Docs files is a must.

With this feature set in mind, I started to create, in my spare time, what is now tentatively called Project Curva, and I have been at it for the last 3 months. I will post more about it in the future: the technology used, the technical challenges, and some lessons learned.

A beta is due at the end of April 2021.

Update 2021-10-18

The project is called NiwPlan and can be checked out at NiwPlan.com.

Hugo
2020.12.03

Hello World. Testing the new site!

For the Nth time, I have migrated the blog to a new blogging system. This time, I’m using Hugo.

Hugo belongs to the class of CMSs that generate static sites. In an analogy with compiled versus interpreted programming languages, the whole site is generated beforehand and the result is uploaded to a server.

The main advantages of this method are a substantially faster site and zero attack surface from the CMS. The main disadvantages are a less user-friendly interface and long build times.

Let’s dig into these issues:

Faster Experience

Since all the pages are now static and pre-made, the only variable is the server latency to deliver the files. The page does not need to be built on the fly for each user, which can be tremendously slow. Rebuilding it time after time after time also wastes server CPU.

Most CMSs have some caching system to mitigate this issue. They first check whether the page has already been built; if so, they serve it; if not, they build it and save the result. The problem lies in implementing a CDN and/or a technique to invalidate the cache and force a rebuild (in case the content was altered by the author).

[Image: Build time]

More Secure

Since it does not compile pages on the fly, it eliminates the security issues inherited from the language. It also does not access any type of database, and there is no admin page. Even resistance to DoS attacks is much more robust, since the CDN can easily migrate traffic to another server.

User Interface (Lack of)

Well, Hugo takes a developer-driven approach that requires the user to use an IDE and compile the whole site. It does not offer any type of interface where you can drag and drop widgets. It is definitely not WYSIWYG.

If you are seasoned with programming tools, you will not have much trouble; it will all feel very familiar. For a non-tech-savvy mom blogger, Hugo is a no-go.

Build Times

Even seeing a single post that you just wrote takes time. Like with compiled programming languages, the site has to be built before you can check it. Thankfully, Hugo has an automatic service that propagates incremental changes, and it is really fast, so iterating on content will not slow you down.

It will take even more time if you have extra processing steps implemented, like resizing images.

But the process of rebuilding the entire site might take a while. Thankfully, for production the whole build process can be delegated to CI/CD tools. Using GitHub or GitLab, they will automatically build the site on each commit.
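Since this blog ended up hosted on Netlify (see the posts above), here is a minimal sketch of what that delegation can look like in a netlify.toml; the pinned Hugo version is an assumed example, not necessarily what this site uses:

# Minimal sketch: delegate the Hugo build to Netlify's CI.
# The pinned version below is an assumed example.
[build]
command = "hugo --minify"
publish = "public"

[build.environment]
HUGO_VERSION = "0.79.0"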

The process of writing this post, the very first on the new platform, was quite nice. But then, I am in the perfect spot between the product’s requirements and my technical skills.

Anyway, I’m going to try to post more content in the following months. :)

[Image: Best static site generator 2020]

GDC: Awesome Video Game Data 2017
2017.10.25

I follow the GDC (Game Developers Conference) channel on YouTube, and right now I recommend you do the same. There is a great number of excellent talks (of course there are some exceptions, like the lame at-the-time GDC board member Peter Molyneux making plain and simple propaganda).

There is one that I just watched that is very eye-opening: the annual talk from the guys at EEDAR (a data consolidation company), presenting numbers for the whole industry. It covers prices, sales, regions, and mobile/PC/consoles. Everything!

It is a must-see.

Linux on Notebook, Take 2, Mini-Buntu
2017.07.13

My notebook is not new: I bought the Yoga 2 Pro almost 4 years ago. Two years back, I got annoyed with Windows and decided to install Linux on it. I was scared because, unlike most of my PCs, which I assembled myself, the Lenovo had a warranty and possibly custom hardware.

As I told before, the attempt failed. It was giving me too many headaches. I also generally use my notebook to program and develop games, and because the Unity Editor was not available (at least not in a reasonable version), I was kind of forced to migrate back to Windows 10.

[Image: Linux 3]


About 3 months ago, I decided to give it a second shot. In case I was not clear: I have used Linux on the desktop, in dual boot, for about 15 years. I watched Ubuntu enter the market. But once I started to be systematically involved in making games, the need for Windows came along too.

Back to the experiment. It was a requirement for me that general performance had to be great. Not good, great. I preferred to stay on a Debian-like distro, because that is what I’m familiar with. From the Ubuntu family, if possible. So I selected both Kubuntu and Lubuntu for a ride.

Kubuntu was the one I had tested before. I have liked KDE since version 2, but it again failed to deliver a blazing fast experience. On the notebook, the boot time was several minutes, while even Windows 10 took just a couple of seconds. I then decided to format and install Lubuntu.

Lubuntu is an Ubuntu derivative using the LXDE desktop environment. Super light. Man! Boot was fast, and once ready it consumed a fraction of the RAM of both Windows and Kubuntu. However, during my 4-week test it gave me too many little problems, so I decided to make another switch.

The next stop was Xubuntu, whose Xfce desktop is fine on a 13-inch monitor. Then came the software selection. Lubuntu was super short on preinstalled stuff, which I like because I generally don’t use it anyway, but Xubuntu came with some. The good news is that the selection does not consume much drive space, and the programs are light enough in case I really want to use them.

[Image: Linux 2]

I had to install Steam, and it works nicely. Unfortunately, GOG’s Galaxy does not currently have a Linux version, so games have to be installed manually, one by one. Also, your play time will not be tracked, nor will you be alerted about updates. A second negative point is that most GOG games do not use the new cloud save feature, so playing a bit on the notebook and a bit on the desktop only works for games where progress does not matter. Fingers crossed for the future.

[Image: Linux 4]

Finally, I was looking for a game engine that works on Linux. Unreal, as I found out, works, but you have to compile it yourself. GREAT 🙁 I did it. It took hours, and the result was too many crashes and too big a suite to work on a notebook. So I was once again looking for a lightweight engine. I tested Godot and liked it, but it is still lacking.

Then I found out that Unity is, in fact, releasing the updated engine for Linux through an alternative channel (the forums). I installed it too. It crashes a lot, but it works. I’ve been playing game developer on the notebook ever since. With the excellent Visual Studio Code editor, it makes my days fun.


After two and a half months of working most of the time on this notebook, I could be a happier man, but in general I already am one. It is fast, it is free, and it is close to the environment I face when I deal with cloud and internet stuff. I plan to migrate to a newer machine next year, mostly to get more RAM and better battery life. Currently it lasts 3 hours, which is, by any measure, a shame for a mobile device.

[Image: This is currently my desktop]

Bruno MASSA