I was fighting with Unsloth / Nvidia CUDA / Python versioning today. I eventually gave up on Unsloth and managed to use PyTorch directly - however, as part of the journey I found that Microsoft, Nvidia and Linux have made GPU paravirtualisation so smooth it's invisible.

First of all, it just works in WSL - install the Nvidia drivers (inside WSL) and you magically get full GPU support, with no prior configuration of how much of the GPU you are taking; it's shared dynamically and very smoothly. Once the drivers are installed you can use the GPU as if it were local.
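If you want a quick sanity check, the usual tools are visible from inside the WSL distribution straight away - something like this (the PyTorch check obviously assumes you have PyTorch installed):-

# the GPU shows up inside WSL without any pass-through configuration
nvidia-smi
# and CUDA is visible to anything built against it, e.g. PyTorch
python3 -c "import torch; print(torch.cuda.is_available())"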

This is clearly intended to handle container tech, which brings us on to the Nvidia Container Toolkit (providing the local libraries for the containers) and Podman GPU support. The recipe is:-

  • Install CUDA and the Nvidia drivers on the host
  • Install CUDA and the Nvidia drivers on the WSL instance / VM, or even the Nvidia Container Toolkit.
  • Install the Nvidia Container Toolkit on the docker machine/podman machine in WSL ("podman machine ssh" to get into it) - see the sketch after this list.
  • Use "--gpus all" on podman when running a container to enable the GPUs to be passed through to the container!

Overall I was very surprised, as Nvidia historically put a lot of roadblocks in the way of paravirtualisation of their cards.  It's great to see this working.

Regarding the situation with Unsloth, I found it was breaking even though the libraries were installed - I think different bits have different dependencies. It could do with some re-engineering to guarantee its consistency; even their own containers weren't working, and I tried several tags.

Instead, I created my own micro-environment to work in with a Dockerfile based on Nvidia's PyTorch image:-

FROM nvcr.io/nvidia/pytorch:25.09-py3
EXPOSE 8888
WORKDIR /app
RUN pip install jupyter
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--port=8888", "--no-browser", "--allow-root"]

Once this had been built with a quick:-

podman build -t aiplatform .

I could run the container using

podman run --gpus all -p 8888:8888 aiplatform

And then access a Jupyter notebook with a fully functional CUDA-enabled PyTorch in it.  This allowed me to get on and sort out what I wanted (I'm porting some of my old AI tech into the current LLM model ecosystem).
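A quick way to confirm the GPU really is visible from inside the container is a trivial check like this (nothing specific to my setup), run from a notebook cell or a terminal:-

python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"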

One thing I did find is that if I used the Docker CLI I wasn't able to use the --gpus parameter. If I wanted to use the docker command, I had to remove the Docker CLI and then symlink podman to docker with a quick:-

mklink "C:\Program Files\RedHat\Podman\docker.exe" "C:\Program Files\RedHat\Podman\podman.exe"
Permalink 

I've upgraded the site to the latest .NET version - importantly, there was a vulnerability in Kestrel (the embedded webserver), so it's worth keeping up to date.

This is also a good proof of life for anyone reading.  Hope everyone is well!

Outside of work (for my own company), I'm currently working on building AI-related agents and components to build a highly persistent and adaptive AI - this is all the rage, so nothing special, and there are tons of examples of agentic development going on, even though I've been working on it on and off since 1998.  It's good to see ideas come to fruition gradually, even if I'm not always the one making the achievement first - hopefully there'll be some achievements worth talking about soon.

Inside work (at my day-to-day employer), we're building machine learning for a very specific purpose and maintaining a suite of applications for internal use - there's not much to talk about.

Permalink 

While .NET has fully native Postgres support using the Npgsql library, it requires telling the connection that you want read-write intent in code rather than in the connection string, so you end up with a normal connection string containing a list of servers:-

    Host=server1.example.org,server2.example.org;Username=myusername;Password=mypassword;Database=mydatabase

By default this connection string will use both server1 and server2 to handle all requests; however, if you use NpgsqlDataSourceBuilder and call CreateConnection, you can pass TargetSessionAttributes.ReadWrite as a parameter to guarantee you get a writeable master server:-

    var dsBuilder = new NpgsqlDataSourceBuilder(Configuration.GetValue<string>("Database:ConnectionString"));
    var ds = dsBuilder.BuildMultiHost().CreateConnection(TargetSessionAttributes.ReadWrite);

This enables you to have both read and read-write intent on your application's connections to the database.

But what happens if you're not using .Net and want to interact with the writeable server first?

It turns out there's a target_session_attrs parameter:-

    postgresql://server1.example.org,server2.example.org/mydatabase?target_session_attrs=any
    postgresql://server1.example.org,server2.example.org/mydatabase?target_session_attrs=read-write

And this is an option on the ODBC driver too:-

    Driver={PostgreSQL};Server=server1.example.org,server2.example.org;Port=5432;Database=mydatabase;Uid=myusername;Pwd=mypassword;target_session_attrs=read-write;

There are other parameters you can use here too:-

  • any - Connect to any server; this is the default.
  • read-write - Connect to any server that is writeable.
  • read-only - Connect to any server that is in read-only mode; you can either have a server set to standby or set to read-only explicitly.
  • primary - Connect to the primary server.
  • standby - Connect to any standby server.
  • prefer-standby - Try to connect to a standby server first, and then use the primary if no connection could be made. I have read it falls back to any, so if you have no replication it will still find a system.
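If you want to see which kind of server you've actually landed on, psql accepts the same URI format, so a quick test looks something like this (using the placeholder server names from above):-

    # connects to whichever host satisfies read-write, then asks if it is a standby (f = writeable primary)
    psql "postgresql://server1.example.org,server2.example.org/mydatabase?target_session_attrs=read-write" -c "SELECT pg_is_in_recovery();"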
Permalink 

During my switch to Postgres, I also deployed an active/passive physical replication cluster, at the time on Postgres 16.  I wanted to upgrade to Postgres 17, so I did a backup first:-

    pg_dumpall > 20241209-PreUpgradeBackup.bak

Created a new Postgres 17 server, and restored the dump there, to allow me to check and test that PG17 works with everything:-

    /usr/local/pgsql/bin/psql -d postgres -f ~/backups/20241209-PreUpgradeBackup.bak

But this would require a period of data loss in production, which could be unacceptable. When upgrading Postgres it is also important to rebuild your replicas from the new instance, which provides the opportunity to do a seamless, zero-downtime upgrade - and it's also a good way to upgrade each server separately.  I also took this as an opportunity to test my failover processes, so I removed my master server from the pool [1] (so I could bring it back up if there was a problem with the upgrade) and ran this on a secondary server:-

    SELECT pg_promote(); 

This makes the instance writeable, instantly failing over (using the connection string approach discussed in my previous blog post on migrating to Postgres, the applications now switch to writing to this location). I checked everything ran smoothly for DR purposes, installed Postgres 17 on a new instance, configured the pg_hba.conf file to allow replication from the new master, and ran:-

    pg_basebackup -h yourmasterservername.yourdomain.com -U replica_user -X stream -C -S yourreplicationslotname -v -R -W -D /var/lib/postgresql/17/data/
Remember to run this as your postgres user, or you will need to chown -R postgres /var/lib/postgresql/17/data afterwards.
 
This will recreate the entire data directory and populate it with a clone of the master server, setting the server to a restoring state so that it's a valid secondary.  If you want to have your replicas support reading, then set the hot_standby setting to on in the postgresql.conf file.  You can also then wipe your original master, and re-add it as a secondary if you care about re-using the machine/VM.
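To satisfy yourself that the new pairing is healthy, a couple of quick checks using standard Postgres functions and views (nothing specific to this setup):-

    # on the new standby - should return 't', i.e. it is in recovery and read-only
    psql -c "SELECT pg_is_in_recovery();"
    # on the new primary - one row per connected standby, state should be 'streaming'
    psql -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"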
 
[1] - The pooling I'm using is a custom-written utility that monitors the servers and updates my DNS automatically when there's a change in state.  If you are in Azure there's a Virtual Endpoints feature that does this, or you can use a Postgres proxy - there are a few of them around.  If you are in an enterprise situation then your load balancers can probably handle this.

Permalink 

Once you've installed Podman, run the following command in Windows Terminal to get the Docker CLI installed:-

    winget.exe install --id Docker.DockerCLI

With this added, Visual Studio will stop complaining about the lack of Docker Desktop and start to run things; the container tools will also work.  You can also change the settings which start/stop Docker Desktop in Visual Studio's Tools > Options menu.
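The Docker CLI still needs an engine to talk to, which is where the podman machine comes in - roughly like this (the exact wiring of the Docker-compatible endpoint depends on your Podman version, so treat it as a sketch):-

    podman machine init
    podman machine start
    # the CLI should now talk to podman's Docker-compatible API endpoint
    docker version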

For those in a corporate environment, this should save a lot of money on Docker Desktop seats!

Permalink 

After more issues with MS SQL AlwaysOn due to a fibre provider problem bringing one node down, I decided to move my blog to run off a Postgres cluster with read/write targeting.  Thankfully I'd been planning to do this for a long time and had already set up the database, schema and tables, and replicated the data.

I found a few interesting things in doing this and got to use the new keyed services feature in .NET 8, since I wanted the option of alternating between MS SQL and Postgres, and between read and read/write contexts. I had to make the following changes:-

  1. I created two derived classes from my EF Core model and moved all the Fluent operations into them; initially I just duplicated them.
  2. In the Postgres implementation, I added the SnakeCaseNamingConvention, a default schema and the citext extension (for case-insensitive page names).
  3. Moved all interactions with both EF Core contexts to an interface and extracted all used methods and DbSet<T> declarations to the interface.
  4. In the Startup, I put a condition on which Database:Platform setting was supplied and then set up a scoped database context.  Postgres is very interesting here in that you explicitly set your TargetSessionAttributes when setting up your connection.  Historically it was in the connection string, like with MS SQL, but this limits you to only one mode of operation per connection string - now it's dynamic and driven through a DataSourceBuilder.
  5. I also then registered a keyed scoped context with the read-write property set.
  6. Then tagged the constructor parameters that needed the new ReadWrite instance of the context.
  7. Ran things and found it broke in surprisingly few places, mostly where DateTime.Now was used.  I converted these to DateTime.UtcNow, as it should have been all along - and it was happy and worked.

Let's go into more depth on these steps...

Context Changes

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    //Requires the addition of the EFCore.NamingConventions Nuget Package
    optionsBuilder = optionsBuilder.UseSnakeCaseNamingConvention();
    base.OnConfiguring(optionsBuilder);
}

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //Since I'm using one database with multiple sub-schemas now.
    //MSSQL defaults to dbo and schemas are less strict; Postgres really likes having one specified.
    modelBuilder.HasDefaultSchema("blog");
    //Provides case insensitivity
    modelBuilder.HasPostgresExtension("citext");

    //... the rest of your modelBuilder Fluent entity setup

    modelBuilder.Entity<PageContent>(entity => {
        //Enable case insensitive searching on this field; the column in the database has been changed to this type too.
        entity.Property(e => e.PageId).HasColumnType("citext");
        // ... more setup
    });
}

Base Context Changes

The base context needed a new constructor to allow it to accept both the MSSQL and the Postgres DbContextOptions<T> types, and it needed to implement the new interface - then it was just a list of DbSet<T> properties for the various table representations:-

public partial class NullifyDBEntities : DbContext, INullifyBlog
{
  public NullifyDBEntities() { }
  public NullifyDBEntities(DbContextOptions<NullifyDBEntities> options) : base(options) { }
  protected NullifyDBEntities(DbContextOptions options) : base(options) { }

It was crucial to make the new constructor protected so that the EF Core components don't break, as they expect to be given only the generic options type that matches the current class.  Since this base context isn't ever going to be used directly, I'm tempted to lock it down further, but I'm wary about the rest of EF being a little unhappy with that.

The New Interface

While most of the interface is specific to my data types for the blog, there were also a number of operations I wanted to expose on the interface to make them easier to call.  Because of that, I exposed things like SaveChanges, Add, Attach, Update, Remove and SaveChangesAsync - plus inherited from IDisposable and IAsyncDisposable.  Nothing worth providing a code example for, but for completeness:-

public interface INullifyBlog : IInfrastructure<IServiceProvider>, IResettableService, IDisposable, IAsyncDisposable
{
  DbSet<PageContent> PageContent { get; set; }
  // ... lots of other DbSet<T> properties

  int SaveChanges();
  // ... and then a few more utility methods.
}

Controller Changes

The controller changes were extremely minimal; I didn't need to change any business logic - just the type the constructor received (and the property the base class stored), and I added the new FromKeyedServices attribute where needed.

    [Authorize(Roles = "Editor")]
    public class ManageArticlesController : BaseController
    {

        //The new attribute that provides keyed services.
        public ManageArticlesController([FromKeyedServices("ReadWrite")] INullifyBlog context) : base(context)
        {
        }

This was surprising - not having to change reams of historic code was great.

One thing I did find when I ran it the first time was that I had runtime errors: I had to use UTC for all my dates and times.  Thankfully there were only a few places where I'd used local times to allow future-dated posting:-

[ResponseCache(Duration = 60, VaryByHeader = "id", VaryByQueryKeys = new string[] { "User" })]
public ActionResult Index(int id)
{
  var article = (from a in Database.NewsArticle
                where a.Id == id
               && a.PostedAt < DateTime.UtcNow
                select new NewsArticleViewModel { Article = a, CommentCount = a.Comment.Count, Author = a.U.Username, CanEdit = false }).SingleOrDefault();

  if (article == default(NewsArticleViewModel))
  {
    return StatusCode(404, "Unable to find an article with that ID");
  }
  else
  {
    return View(article);
  }
}

The people writing the Postgres client library have quite rightly made the choice to warn you if you use a date affected by a timezone - the number of times that not using UTC in a database has bitten people is really high.

Startup Changes

And in the startup class I added support for the new Postgres-based data context, while retaining the original MSSQL context if needed.

    if (Configuration.GetValue<string>("Database:Platform") == "Postgres")
    {
        services.AddDbContext<NullifyDbPGContext>(
            options => options.UseNpgsql(Configuration.GetValue<string>("Database:ConnectionString"))
        );
        services.AddScoped<INullifyBlog, NullifyDbPGContext>();

        //Enables read-write operations on only those servers that support it, and automatically selects that server when connecting, otherwise it will use any working server in the cluster.
        //The new keyed attribute here makes things really neat.
        services.AddKeyedScoped<INullifyBlog>("ReadWrite", new Func<IServiceProvider, object, NullifyDbPGContext>((services, o) =>
        {
            //Uses the Postgres specific data source builder to connect to the correct servers on demand.
            var dsBuilder = new NpgsqlDataSourceBuilder(Configuration.GetValue<string>("Database:ConnectionString"));
            var ds = dsBuilder.BuildMultiHost().CreateConnection(TargetSessionAttributes.ReadWrite);
            var builder = new DbContextOptionsBuilder<NullifyDbPGContext>()
                                .UseNpgsql(ds);
            var context = new NullifyDbPGContext(builder.Options);
            return context;
        }));
    }

Docker Environment Changes

And finally, the Docker container can now have Database__ConnectionString and Database__Platform supplied so that these settings can be set per environment.  Importantly, I moved from using : in the environment variable names to __ so that it's consistent across all platforms:-

services:
  www:
    image: ServiceProvider.tld/ImageName:Tag
    ports:
      - "80:8080"
    environment:
      Database__ConnectionString: YourConnectionStringHere
      Database__Platform: Postgres
    #Your replication settings and network properties here

And a reminder that the container's internal port number has changed from port 80 in prior .NET versions' Docker support to port 8080.  This is accompanied by the containers no longer running the dotnet process as root, so they're more secure but also harder to troubleshoot.  There's still an option to revert this behaviour, but I don't recommend you do it.
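If you really do need the old port back (the non-root user is a separate concern), my understanding is that the .NET 8 images read an ASPNETCORE_HTTP_PORTS environment variable, so something like this would revert it - the image name is the placeholder from above:-

# not recommended - puts the internal listening port back to 80
docker run -e ASPNETCORE_HTTP_PORTS=80 -p 80:80 ServiceProvider.tld/ImageName:Tag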

Conclusion

Overall, this was a surprisingly light amount of work, all done in an hour with more time spent modernising the deployment pipeline and writing this post - especially given I'd completely redone the schema too to use lowercase snake case during the database migration.  Good fun!

And this codebase has now been upgraded/modernised significantly every year since August 1999, making it 25 years old - which is apt, as it was my 41st birthday a few days ago.

Permalink 

Upgrading to .NET 8 revealed a gotcha in the new Docker containers: they changed the port number of the webserver inside the container from port 80 to port 8080, so you need to update any compose configurations or port mappings you have.
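For example, where a mapping used to pass port 80 straight through, it now needs to target 8080 inside the container - something along these lines (the image name is just a placeholder):-

# host port 80 now maps to the container's new internal port 8080
docker run -p 80:8080 myregistry.tld/mysite:latest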

The link discussing this is here.

Permalink 

I'll preface this by saying this post is just for me; it's for Alpine Linux and specific to one use case where I need a static file webserver but want to reconfigure the domains occasionally.

You can set up self-reloading config if you have a script that watches for file changes to the conf files. Just apk add inotify-tools, then create a script like this in /etc/inotify/nginx-reload.sh and make it executable with chmod u+x nginx-reload.sh:-

#!/bin/sh
# plain sh, since Alpine doesn't ship bash by default

set -e

# the outer loop restarts the watch if inotifywait ever exits
while true; do
    # wait for any .conf file under /var/www to be modified, then reload nginx
    while inotifywait /var/www -e modify --include '.*\.conf'; do
        nginx -s reload
    done
done

You then need to set this up to run on startup using openrc. Put the following in /etc/init.d/nginxreload and again, chmod it u+x:-

#!/sbin/openrc-run

name="Nginx Reloader"
command="/etc/inotify/nginx-reload.sh"
command_args=""
command_user="root"
command_background=true
pidfile="/etc/inotify/nginx-reload.pid"

depend() {
        need net localmount
}

Now run:-

rc-update add nginxreload default
service nginxreload start

And any edits to the conf files specified result in nginx reloading its configuration.

Permalink 

Are you having issues with loading the designer and getting InvalidOperationException and a NamedPipeTimeout trying to connect?

Open a PowerShell in Admin mode and run this to set an exclusion in Defender for the design server:-

Add-MpPreference -ExclusionProcess 'DesignToolsServer.exe'
Permalink 

I just upgraded this site to .NET 6 - I thought it was overdue, with the release of Visual Studio 2022 having happened a while ago now.  Great to see it was a quick and easy upgrade, and the performance has improved even more on the new version too.  I just updated the build version, compiled, then fixed some warnings caused by RenderPartial - I'm pleased to see they sorted that deadlock!

I also took the time to switch to Alpine Linux for the Docker container hosting it, as it's my preferred Linux distribution; however, I found I needed to solve the ICU issue (the MS SQL Server client expects to have localisation libraries) by getting the packages during the container build:-

FROM mcr.microsoft.com/dotnet/aspnet:6.0-alpine AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443
RUN apk add --no-cache icu-libs
ENV DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=false

Oddly, when building the ARM64 containers for the dev version of the site (which runs on a pair of Raspberry Pis), I also encountered a problem with the cross-compilation of Docker containers, but that was easily fixed by running the below on the host building the containers as part of the pipeline:-

docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
This allowed the ARM components inside the 6.0-alpine-arm64v8 container to work so the apk command didn't error.
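For reference, once the QEMU handlers are registered a cross-build is just a normal build with the target platform specified - a rough sketch, with a made-up tag name:-

docker build --platform linux/arm64 -t mysite:arm64-dev .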
 
Other than the above it was seamless, and it's nice to switch to an even smaller container size.
 
Permalink 

Happy new year to anyone still reading this.  I've been writing on this blog for more than 21 years now, varying from extensive articles to years where I've not felt the need to say anything at all (including last year!).  Just this week I completely rebuilt this blog to run on .NET 5 and be hosted under Docker - no changes to paths, content, etc., but it's good to keep things current and modern, and with each rewrite I distill it further and make it simpler.

With the move to a more digital way of running businesses and all the working from home, it's an interesting time for a technologist.  A decade ago this would have been possible, but unlikely to be something we'd end up doing for real or for such a prolonged period; now it's suddenly normal.  It seemed appropriate to confirm all is well for anyone that still does look.

Life/work update: I've been managing a team of developers for the last few years now; I'm still staying as technical as I can and run regular training sessions for my team.  I'm doing fairly well.

Permalink 

Over the last year I went permanent, and I've been a Delivery Lead managing about 30 people and a swathe of business applications.  It's good fun, and I've still got the ability to be really deeply technical or help my team members with actual programming at times - I'm glad to have a great team who can do most things on their own.

I've just updated this site with the latest version of the code that runs it (I did a complete re-write, though the layout is exactly the same for the moment).  If anyone spots anything broken please e-mail me.

I'm still around and doing fine.  My company (Fizl Ltd) is still in existence and sitting there ready for me to use for things.

Permalink 

I can't believe it's still a thing; I've encountered this error repeatedly since the release of Windows Identity Foundation and have just hit it again with Azure AD and MVC.  You just have to add this somewhere before using an AntiForgeryToken in MVC:-

AntiForgeryConfig.UniqueClaimTypeIdentifier = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name";

To get past the error:-

A claim of type 'http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier' or 'http://schemas.microsoft.com/accesscontrolservice/2010/07/claims/identityprovider' was not present on the provided ClaimsIdentity.

Permalink 

Well, I had a very bad year, with my father passing away and various business-related things like the umbrella company I was using going insolvent and taking an amount of my money with it (and I expect a few more road bumps over the next few months as things settle down), so I'm hoping for a better year going forward.  Many of the things that happened have just pushed me to adapt.

I know I haven't posted blog articles properly in years, as I've been constantly trying to avoid posting work-related items (I've been contracting for others).  I've been doing some interesting research of my own recently and hope to talk more about that instead soon, though there are a few hurdles to overcome before there's anything useful produced from it.  I do need to get back in the habit of doing my own things, though; I have buried my head into my customers' work like it was sand, without thinking of my own reality, and that needs to change.

I have lots of ideas for Fizl Ltd for the future too, so I wish any readers that remain well and hope you will remain patient - pretty easy with an RSS reader.

First things first though, this site needs an upgrade and some new content...

Permalink 

I'm still alive and contracting.  I've been radio silent for some time because I don't really want to talk about anything related to my employer on here, and I hadn't done much personal software development for a while.  About six months ago I started doing some again, so hopefully I'll resume posting.

I've just finished a full set of server upgrades for my off-site physical servers (nice new hardware, and a move to Windows Server 2012 R2).  If you were hosted on them and haven't contributed for more than ten years or so, your accounts may not work when you try to get in, as I didn't migrate those people.  I may still have copies of your content if you ask before the old servers are turned off.

I still have spare capacity for friends to use on the servers (much more than previously, actually).  For those of you using the TeamSpeak server, it's still on the same address, beta.nullify.net, but the IP will have changed.  Let me know if there are any issues.

Permalink