If you recently upgraded AKS to v1.23 or later, you may have noticed that some containers stopped working, most often failing to start with a message like “Error: failed to start containerd task "solr": hcs::System::CreateProcess solr: The system cannot find the file specified.: unknown”.
As we spent quite some time researching those issues and also contacted Sitecore support, I’ve decided to write this post so it can be helpful to anyone else facing the same kind of issues.
Although the Docker runtime is still available in v1.23, Containerd is now the default, which is why you get this kind of error. Bear in mind that Docker is going to be fully removed in v1.24, so I suggest you take action as soon as possible to avoid blocking upgrades or facing issues later.
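To confirm which runtime your nodes are actually using, a minimal sketch like the one below may help. It assumes kubectl is already configured for your cluster, and the function name node_runtimes is just an illustrative helper. The RUNTIME column should read something like containerd://<version> or docker://<version>.

```shell
# Print each node's name, OS image, and container runtime.
# A RUNTIME value starting with "containerd://" means the node no longer uses Docker.
node_runtimes() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,OS:.status.nodeInfo.osImage,RUNTIME:.status.nodeInfo.containerRuntimeVersion'
}

# Usage (needs kubectl pointed at your AKS cluster):
#   node_runtimes
```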
About Sitecore default images
If you are using Sitecore images at v10.0.2 or earlier, you will find them failing to start, mostly the solr-init and mssql-init containers.
Sitecore has fixed the images for the following versions:
Please note that if you reference the images using the “two-digit” tag, you’re good to go, as you will be getting the latest version.
About the runtime deprecation
AKS announced the deprecation of Docker in version 1.20:
Dependency on Docker explained
A container runtime is software that can execute the containers that make up a Kubernetes pod. Kubernetes is responsible for orchestration and scheduling of Pods; on each node, the kubelet uses the container runtime interface as an abstraction so that you can use any compatible container runtime.
In its earliest releases, Kubernetes offered compatibility with one container runtime: Docker. Later in the Kubernetes project’s history, cluster operators wanted to adopt additional container runtimes. The CRI was designed to allow this kind of flexibility – and the kubelet began supporting CRI. However, because Docker existed before the CRI specification was invented, the Kubernetes project created an adapter component, dockershim. The dockershim adapter allows the kubelet to interact with Docker as if Docker were a CRI compatible runtime.
Switching to Containerd as a container runtime eliminates the middleman. All the same, containers can be run by container runtimes like Containerd as before. But now, since containers schedule directly with the container runtime, they are not visible to Docker. So any Docker tooling or fancy UI you might have used before to check on these containers is no longer available.
You cannot get container information using docker ps or docker inspect commands. As you cannot list containers, you cannot get logs, stop containers, or execute something inside a container using docker exec.
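As a rough sketch of what to use in place of the Docker commands mentioned above, the kubectl-based helpers below cover the same ground from outside the node. The function names and the default namespace "sitecore" are assumptions for illustration only; adjust them to your deployment.

```shell
# Hypothetical default namespace; override it to match your deployment.
NAMESPACE="${NAMESPACE:-sitecore}"

# Rough equivalent of `docker ps`: list pods and their status.
list_pods() { kubectl get pods -n "$NAMESPACE" -o wide; }

# Rough equivalent of `docker inspect`: pod spec, state, and recent events.
# Start-up errors such as the CreateProcess failure appear under Events.
describe_pod() { kubectl describe pod "$1" -n "$NAMESPACE"; }

# Rough equivalent of `docker logs`: logs of one container in a pod.
container_logs() { kubectl logs "$1" -c "$2" -n "$NAMESPACE"; }

# Rough equivalent of `docker exec`: run a command inside a container.
exec_in_container() {
  pod="$1"; container="$2"; shift 2
  kubectl exec "$pod" -c "$container" -n "$NAMESPACE" -- "$@"
}

# Usage (pod and container names are placeholders):
#   list_pods
#   describe_pod <pod-name>
#   container_logs <pod-name> <container-name>
#   exec_in_container <pod-name> <container-name> powershell -Command "Get-ChildItem"
```

If you have shell access to a node and crictl is installed there, crictl ps, crictl inspect, crictl logs and crictl exec offer a closer match to the Docker CLI.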
Please refer to the official documentation for deeper details:
About our custom images
Ok, so now that things are a bit clearer, and we know the Sitecore base images are fixed in the latest versions (at least for v10), what about our custom ones?
So far, I’ve identified some changes required in our Dockerfiles to make them work as expected on the Containerd runtime.
ENTRYPOINT and CMD
The syntax is slightly different; I’ll share examples to make the changes easier to follow.
This is the original Dockerfile:
ENTRYPOINT .\StartInit.ps1 -ResourcesDirectory $env:RESOURCES_PATH -SqlServer $env:SQL_SERVER -SqlAdminUser $env:SQL_ADMIN_LOGIN -SqlAdminPassword $env:SQL_ADMIN_PASSWORD -SitecoreAdminUsername $env:SITECORE_ADMIN_USERNAME -SitecoreAdminPassword $env:sitecore_admin_password -SitecoreUserPassword $env:SITECORE_USER_PASSWORD -SqlElasticPoolName $env:SQL_ELASTIC_POOL_NAME -DatabasesToDeploy $env:DATABASES_TO_DEPLOY -PostDeploymentWaitPeriod $env:POST_DEPLOYMENT_WAIT_PERIOD ` -DatabaseUsers @(...)@]
The updated one:
ENTRYPOINT ["powershell.exe", ".\\StartInit.ps1", "-ResourcesDirectory $env:RESOURCES_PATH", "-SqlServer $env:SQL_SERVER", "-SqlAdminUser $env:SQL_ADMIN_LOGIN", "-SqlAdminPassword $env:SQL_ADMIN_PASSWORD", "-SitecoreAdminUsername $env:SITECORE_ADMIN_USERNAME", "-SitecoreAdminPassword $env:sitecore_admin_password", "-SitecoreUserPassword $env:SITECORE_USER_PASSWORD", "-SqlElasticPoolName $env:SQL_ELASTIC_POOL_NAME", "-DatabasesToDeploy $env:DATABASES_TO_DEPLOY", "-PostDeploymentWaitPeriod $env:POST_DEPLOYMENT_WAIT_PERIOD", ` "-DatabaseUsers @(...)@]
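Please note that with this exec (JSON array) form you now need to specify the shell explicitly, because the command is no longer wrapped in the image’s default shell.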
Here is the updated solr-init Dockerfile:
ENTRYPOINT ["powershell.exe", ".\\Start.ps1", "-SitecoreSolrConnectionString $env:SITECORE_SOLR_CONNECTION_STRING", `
    "-SolrCorePrefix $env:SOLR_CORE_PREFIX_NAME", `
    "-SolrSitecoreConfigsetSuffixName $env:SOLR_SITECORE_CONFIGSET_SUFFIX_NAME", `
    "-SolrReplicationFactor $env:SOLR_REPLICATION_FACTOR", `
    "-SolrNumberOfShards $env:SOLR_NUMBER_OF_SHARDS", `
    "-SolrMaxShardsPerNodes $env:SOLR_MAX_SHARDS_NUMBER_PER_NODES", `
    "-SolrXdbSchemaFile .\\data\\schema.json", `
    "-SolrCollectionsToDeploy $env:SOLR_COLLECTIONS_TO_DEPLOY"]
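Before pushing a rebuilt image, it can be worth checking what entrypoint was actually recorded in its metadata. The sketch below assumes Docker is still available on your build machine; the helper name image_entrypoint and the image name in the usage comment are placeholders. For an image built with the exec form, you would expect the printed JSON array to start with powershell.exe.

```shell
# Print the ENTRYPOINT stored in an image's metadata as a JSON array.
image_entrypoint() {
  docker image inspect --format '{{json .Config.Entrypoint}}' "$1"
}

# Usage (hypothetical image name):
#   image_entrypoint myregistry.azurecr.io/custom-solr-init:latest
```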
I hope this helps you understand and fix your environments deployed on AKS as you upgrade them.