Exportable Linux virtual hard-drives for Hyper-V

Thursday, September 14, 2017 | Posted in Hyper-V Preseed Ubuntu Unattend

As part of learning more about infrastructure creation, testing and deployment, one of the projects I'm working on is the creation of a set of virtual machine images for Windows and Linux that can be used as a base for more complex virtual machine based resources, e.g. a Consul host or a Docker host.

The main virtualization technology I use is Hyper-V, on both Windows 10 and Windows Server 2016, which allows creating Generation 2 virtual machines. Some of the benefits of a Generation 2 virtual machine are:

  • Boot volumes of up to 64 TB
  • Use of UEFI for the boot process
  • Faster boot

The initial version of the base resources allowed creating a virtual machine with Packer and then exporting that entire virtual machine to be used as a base. Ideally, however, all one would need is the virtual hard drive: the virtual machine configuration can easily be created for each individual resource, and it is usually specific to the original host anyway because it contains the absolute path of the virtual hard drive, the names of the network interfaces and so on.
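
As a minimal sketch of this idea, and assuming a base image at C:\images\ubuntu-base.vhdx plus an existing virtual switch named External (all names and paths here are illustrative), recreating a Generation 2 virtual machine from a copy of the base virtual hard drive takes only a few PowerShell commands:

# Copy the exported base disk so that each resource gets its own disk.
Copy-Item -Path 'C:\images\ubuntu-base.vhdx' -Destination 'C:\vms\consul-01.vhdx'

# Create a Generation 2 virtual machine around the copied disk and
# connect it to an existing virtual switch.
New-VM -Name 'consul-01' `
    -MemoryStartupBytes 2GB `
    -Generation 2 `
    -VHDPath 'C:\vms\consul-01.vhdx' `
    -SwitchName 'External'

# Generation 2 machines enable Secure Boot with the Windows template by
# default; for a Linux guest either disable Secure Boot or select the
# 'MicrosoftUEFICertificateAuthority' template.
Set-VMFirmware -VMName 'consul-01' -EnableSecureBoot Off

Start-VM -Name 'consul-01'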

When building Ubuntu virtual disk images, one of the issues with using a Generation 2 virtual machine is that it uses UEFI for the boot process. It turns out that the Ubuntu install process registers the UEFI boot files through the virtual machine configuration, i.e. the virtual NVRAM, rather than placing everything needed to boot on the disk itself. This means that when one creates a new virtual machine from the base virtual disk image it runs into a problem when booting, because the boot information is not present in the new machine. The result is this:

Hyper-V error message due to missing UEFI sector

The obvious solution to this issue is to force the Ubuntu installer to write the UEFI boot files to the virtual hard disk itself, which can be achieved by adding the correct configuration values to the preseed file. Unfortunately the documentation for the different options in preseed files is hard to find. In the end a combination of the Ubuntu sample preseed file, bug reports, old forum messages and a few blog posts allowed me to determine that two parts of the preseed file needed to be changed from the default Ubuntu one to make the installer place the UEFI files in the correct location. The first part is the partitioning section, which requires that at least an EFI partition and (most likely) a boot partition are defined. This all but requires a custom recipe. The one I currently use looks as follows:

# Or provide a recipe of your own...
# If not, you can put an entire recipe into the preconfiguration file in one
# (logical) line. This example creates a small /boot partition, suitable
# swap, and uses the rest of the space for the root partition:
d-i partman-auto/expert_recipe string       \
    grub-efi-boot-root ::                   \
        1 1 1 free                          \
            $bios_boot{ }                   \
            method{ biosgrub }              \
        .                                   \
        256 256 256 fat32                   \
            $primary{ }                     \
            method{ efi }                   \
            format{ }                       \
        .                                   \
        512 512 512 ext4                    \
            $primary{ }                     \
            $bootable{ }                    \
            method{ format }                \
            format{ }                       \
            use_filesystem{ }               \
            filesystem{ ext4 }              \
            mountpoint{ /boot }             \
        .                                   \
        4096 4096 4096 linux-swap           \
            $lvmok{ }                       \
            method{ swap }                  \
            format{ }                       \
        .                                   \
        10000 20000 -1 ext4                 \
            $lvmok{ }                       \
            method{ format }                \
            format{ }                       \
            use_filesystem{ }               \
            filesystem{ ext4 }              \
            mountpoint{ / }                 \
        .

Note that the syntax for the partitioner section is very particular; note especially the dots (.) that terminate each partition definition. Each definition starts with four fields, <minimum size in MB> <priority> <maximum size in MB> <file system>, followed by the partition options. If the syntax is not completely correct nothing will work, and no sensible error messages will be provided either. Additionally the Ubuntu installer complained when there was no swap section, so I added one. The swap section should not be necessary to get the UEFI files in the correct location, but it is apparently necessary to get Ubuntu to install in the first place.

The second part of the preseed file that should be changed is the grub-installer section, where the following line should be added:

d-i grub-installer/force-efi-extra-removable boolean true

This line forces grub to install an extra copy of its UEFI boot files to the removable media path on the EFI system partition. By default the installed boot files are located through a boot entry in the machine's NVRAM, which Hyper-V stores as part of the virtual machine configuration; the extra copy in the removable media path is found by the firmware even without such an entry, which keeps the virtual hard drive bootable when it is attached to a completely new virtual machine.

With these two changes in place the complete preseed file looks as follows:

# preseed configuration file for Ubuntu.
# Based on: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html

#
# *** Localization ***
#
# Originally from: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html#preseed-l10n
#

# Preseeding only locale sets language, country and locale.
d-i debian-installer/locale string en_US.utf8

# Keyboard selection.
# Disable automatic (interactive) keymap detection.
d-i console-setup/ask_detect boolean false
d-i console-setup/layout string us

d-i kbd-chooser/method select American English

#
# *** Network configuration ***
#
# Originally from: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html#preseed-network
#

# netcfg will choose an interface that has link if possible. This makes it
# skip displaying a list if there is more than one interface.
d-i netcfg/choose_interface select auto

# Make the preconfiguration file work on systems both with and
# without a dhcp server.
d-i netcfg/dhcp_failed note ignore
d-i netcfg/dhcp_options select Configure network manually

# Any hostname and domain names assigned from dhcp take precedence over
# values set here. However, setting the values still prevents the questions
# from being shown, even if values come from dhcp.
d-i netcfg/get_hostname string unassigned-hostname
d-i netcfg/get_domain string unassigned-domain

# Disable that annoying WEP key dialog.
d-i netcfg/wireless_wep string


#
# *** Account setup ***
#
# Originally from: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html#preseed-account
#

# To create a normal user account.
d-i passwd/user-fullname string localadmin
d-i passwd/username string localadmin

# Normal user's password, either in clear text
d-i passwd/user-password password reallygoodpassword
d-i passwd/user-password-again password reallygoodpassword

# The installer will warn about weak passwords. Override the warning.
d-i user-setup/allow-password-weak boolean true

# Set to true if you want to encrypt the first user's home directory.
d-i user-setup/encrypt-home boolean false


#
# *** Clock and time zone setup ***
#
# Originally from: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html#preseed-time
#

# Controls whether or not the hardware clock is set to UTC.
d-i clock-setup/utc boolean true
d-i clock-setup/utc-auto boolean true

# You may set this to any valid setting for $TZ; see the contents of
# /usr/share/zoneinfo/ for valid values.
d-i time/zone string UTC


#
# *** Partitioning ***
#
# Originally from: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html#preseed-partman
#

# This makes partman automatically partition without confirmation, provided
# that you told it what to do using one of the methods below.
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
d-i partman/confirm_nooverwrite boolean true

# In addition, you'll need to specify the method to use.
# The presently available methods are:
# - regular: use the usual partition types for your architecture
# - lvm:     use LVM to partition the disk
# - crypto:  use LVM within an encrypted partition
d-i partman-auto/method string lvm
d-i partman-auto/purge_lvm_from_device boolean true

# If one of the disks that are going to be automatically partitioned
# contains an old LVM configuration, the user will normally receive a
# warning. This can be preseeded away...
d-i partman-lvm/device_remove_lvm boolean true
d-i partman-lvm/device_remove_lvm_span boolean true

# And the same goes for the confirmation to write the lvm partitions.
d-i partman-lvm/confirm boolean true
d-i partman-lvm/confirm_nooverwrite boolean true

# For LVM partitioning, you can select how much of the volume group to use
# for logical volumes.
d-i partman-auto-lvm/guided_size string max
d-i partman-auto-lvm/new_vg_name string system

# Select the custom recipe, named grub-efi-boot-root, that is defined
# in the expert_recipe below, rather than one of the predefined recipes
# (atomic, home or multi).
d-i partman-auto/choose_recipe select grub-efi-boot-root

d-i partman-partitioning/confirm_write_new_label boolean true

# If you just want to change the default filesystem from ext3 to something
# else, you can do that without providing a full recipe.
d-i partman/default_filesystem string ext4

# Or provide a recipe of your own...
# If not, you can put an entire recipe into the preconfiguration file in one
# (logical) line. This example creates a small /boot partition, suitable
# swap, and uses the rest of the space for the root partition:
d-i partman-auto/expert_recipe string       \
    grub-efi-boot-root ::                   \
        1 1 1 free                          \
            $bios_boot{ }                   \
            method{ biosgrub }              \
        .                                   \
        256 256 256 fat32                   \
            $primary{ }                     \
            method{ efi }                   \
            format{ }                       \
        .                                   \
        512 512 512 ext4                    \
            $primary{ }                     \
            $bootable{ }                    \
            method{ format }                \
            format{ }                       \
            use_filesystem{ }               \
            filesystem{ ext4 }              \
            mountpoint{ /boot }             \
        .                                   \
        4096 4096 4096 linux-swap           \
            $lvmok{ }                       \
            method{ swap }                  \
            format{ }                       \
        .                                   \
        10000 20000 -1 ext4                 \
            $lvmok{ }                       \
            method{ format }                \
            format{ }                       \
            use_filesystem{ }               \
            filesystem{ ext4 }              \
            mountpoint{ / }                 \
        .

d-i partman-partitioning/no_bootable_gpt_biosgrub boolean false
d-i partman-partitioning/no_bootable_gpt_efi boolean false

# Enforce the usage of GPT - a must-have to use EFI.
d-i partman-basicfilesystems/choose_label string gpt
d-i partman-basicfilesystems/default_label string gpt
d-i partman-partitioning/choose_label string gpt
d-i partman-partitioning/default_label string gpt
d-i partman/choose_label string gpt
d-i partman/default_label string gpt

# Keep this one set to true so we end up with a UEFI-enabled
# system. If set to false, /var/lib/partman/uefi_ignore will be touched.
d-i partman-efi/non_efi_system boolean true


#
# *** Package selection ***
#
# originally from: https://help.ubuntu.com/lts/installation-guide/armhf/apbs04.html#preseed-pkgsel
#

tasksel tasksel/first multiselect standard, ubuntu-server

# Minimal set of packages (see postinstall.sh). This includes the Hyper-V tools.
d-i pkgsel/include string openssh-server ntp linux-tools-$(uname -r) linux-cloud-tools-$(uname -r) linux-cloud-tools-common

# Upgrade packages after debootstrap? (none, safe-upgrade, full-upgrade)
# (note: set to none for speed)
d-i pkgsel/upgrade select none

# Policy for applying updates. May be "none" (no automatic updates),
# "unattended-upgrades" (install security updates automatically), or
# "landscape" (manage system with Landscape).
d-i pkgsel/update-policy select none

# Language pack selection
d-i pkgsel/install-language-support boolean false

#
# Boot loader installation
#

# This is fairly safe to set, it makes grub install automatically to the MBR
# if no other operating system is detected on the machine.
d-i grub-installer/only_debian boolean true

# This one makes grub-installer install to the MBR if it also finds some other
# OS, which is less safe as it might not be able to boot that other OS.
d-i grub-installer/with_other_os boolean true

# Install the boot loader to the first disk.
d-i grub-installer/bootdev string /dev/sda
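
# Force grub to install an extra copy of its UEFI boot files to the
# removable media path on the EFI system partition, so that the disk
# remains bootable when it is attached to a new virtual machine.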
d-i grub-installer/force-efi-extra-removable boolean true


#
# *** Preseed other packages ***
#

debconf debconf/frontend select Noninteractive
d-i finish-install/reboot_in_progress note

choose-mirror-bin mirror/http/proxy string

The complete preseed file can also be found in the http preseed directory of the Ops-Tools-BaseImage project. This project also publishes a NuGet package which has all the configuration files and scripts that were used to create the Ubuntu base virtual hard drive.
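
For context, the preseed file is handed to the installer over HTTP during the Packer build. The fragment below is a rough sketch of the relevant parts of a hyperv-iso builder configuration; the ISO URL, the checksum placeholder and the exact boot_command are illustrative and depend on the Ubuntu version being installed:

{
  "builders": [
    {
      "type": "hyperv-iso",
      "generation": 2,
      "iso_url": "http://releases.ubuntu.com/16.04/ubuntu-16.04.3-server-amd64.iso",
      "iso_checksum_type": "sha256",
      "iso_checksum": "<sha256-of-the-iso>",
      "http_directory": "http",
      "boot_command": [
        "c<wait>",
        "linux /install/vmlinuz auto=true priority=critical ",
        "url=http://{{ .HTTPIP }}:{{ .HTTPPort }}/preseed.cfg<enter>",
        "initrd /install/initrd.gz<enter>",
        "boot<enter>"
      ]
    }
  ]
}

Packer serves the contents of the http_directory on a local HTTP server and substitutes the address into the {{ .HTTPIP }} and {{ .HTTPPort }} template variables, which is how the installer locates the preseed file.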

Software development pipeline - Design accuracy

Monday, September 4, 2017 | Posted in Delivering software DevOps Pipeline design Software development pipeline

ISO defines accuracy as the combination of correctness (being in agreement with the true facts) and consistency (always behaving or performing in a similar way).

The reason to value accuracy as the number one characteristic of the development pipeline is that the development teams need to be able to rely on the outputs of the pipeline, whether those are product artefacts, test results or output logs. Without accuracy the development teams will eventually lose their trust in the pipeline and start ignoring its results, assuming that a failure is caused by the system rather than by the input set. Once that trust is lost it takes a lot of work to regain it.

Once it is established that a development pipeline which delivers correct results is important, the next step is to determine how accuracy can be built into the pipeline. In theory this task is simple: all one has to do is ensure that every part of the pipeline behaves correctly for all input sets. However, as the saying goes:

In theory there is no difference between theory and practice. In practice there is.

In practice achieving accuracy is a difficult task due to the many, often complex, interactions between the pipeline components. As a reminder, the development pipeline consists of:

  • The scripts that are used during the different parts of the cycle, i.e. the build, test and release scripts.
  • The continuous integration system which is used to execute the different scripts.
  • The tools, like the compiler, test frameworks, etc.

Based on this categorization of the pipeline parts, one possible way of approaching accuracy for a development pipeline is to first ensure that all the parts are individually accurate, and then, as a second stage, to deal with the changes in accuracy caused by the interactions between the parts.

For the scripts, tools and continuous integration system this means that each part returns a correct response, and does so consistently, for each input set. Fortunately most scripts and tools already do so for the majority of inputs. In cases where a tool returns an incorrect response the standard software development process should be followed: record an issue, schedule it, then implement, test and deploy a new version of the tool. In this process it is important to test thoroughly to ensure that the changes do not negatively impact the tool's accuracy. This includes (automated) regression testing against known input sets as well as high-level (automated) smoke tests of the entire development pipeline, to validate both that the issue has been fixed and that no further issues have been introduced.

To minimize disruption to the development teams, tests should be conducted outside business hours if no test environment is available, i.e. if the production development pipeline has to be used for the final tests. It is of course better if a test environment is available, so that testing can take place during business hours without affecting the development teams. As a side note: having a test environment with a copy of the development pipeline allows new features and other changes to be developed while the production pipeline is in use, making it easier and quicker to evolve the pipeline and its capabilities.

With the approach to development and improvement of the tools taken care of, the other area that needs to be carefully controlled is the infrastructure on top of which the development pipeline executes. For infrastructure the biggest issues are outages of its different parts, e.g. the network or the different services. In most cases failures on the infrastructure level do not directly influence the correctness of the development pipeline. It is possible for an infrastructure failure to lead to an incorrect service being used, e.g. the test package manager instead of the production one, but unless other issues are present, i.e. the test package manager has packages of the same version but with different content, it is unlikely that an infrastructure failure will allow artefacts to pass the development pipeline when they should not. A more likely result is that failures in the infrastructure lead to failures in the development pipeline, affecting the ability of the pipeline to deliver the correct results consistently.

These types of issues can mostly be prevented by applying modern approaches to IT operations:

  • Configuration management and immutable servers, to ensure that the state of the infrastructure is known.
  • Monitoring, to ensure that those responsible for operations are notified of issues.
  • Standard operating procedures and, potentially, auto-remediation scripts, to quickly resolve issues that arise.

It should be noted that it is not necessary, though extremely helpful, for the infrastructure to be robust in order to provide an accurate development pipeline. Tooling can, and probably should, be adapted to handle and correct for infrastructure failures as much as possible. However, as one would expect, it is much easier to build a development pipeline on top of a robust infrastructure.

The final part of the discussion on the accuracy of the development pipeline deals with the relation between accuracy and the interaction between the tools and the infrastructure. Interaction issues are often hard to understand due to the potentially large number of components involved. Additionally, certain interaction issues may only occur under specific circumstances, like high load, or at specific times of the day, month or year, e.g. during daylight saving changes or on leap days.

Because of this complexity it is important, when building and maintaining a development pipeline, to follow the normal development process, i.e. using version control, (unit) testing, continuous integration, delivery or deployment, and work item tracking for all changes to the pipeline. This applies to the scripts and tools as well as the infrastructure. It is especially important to execute thorough regression testing for any change to the pipeline, to ensure that a change to a single part does not negatively influence the correctness of the pipeline.

Finally, in order to ensure that the accuracy of the pipeline can be maintained, and improved if required, it is sensible to monitor the pipeline and the underlying infrastructure, to version all parts of the pipeline, tools and infrastructure alike, and to track the versions of all the parts for each input set.

nBuildKit release - V0.10.2

Monday, August 28, 2017 | Posted in nBuildKit

Version 0.10.2 of nBuildKit has been released.

This release contains the following minor changes:

  • Removed the custom tasks assembly from the nBuildKit.MsBuild.Actions NuGet package. The tasks were never used from that package because they are also shipped in the nBuildKit.MsBuild.Tasks NuGet package.
  • Added a deployment step to get VCS information during the deploy process.
  • Fixed missing bootstrap steps in the test and deploy stages.

All the work items that have been closed for this release can be found on GitHub in the milestone list for the 0.10.2 release.

Software development pipeline - Design introduction

Monday, August 21, 2017 | Posted in Delivering software DevOps Pipeline design Software development pipeline

In order to deliver new or improved software applications within the desired time span, while maintaining or improving quality, modern development moves towards shorter development and delivery cycles. This is often achieved through the use of agile processes and a development workflow that includes Continuous Integration, Continuous Delivery or even Continuous Deployment.

One of the consequences of this reduction of the development cycle time is that more tasks in the development workflow have to be automated. One way to achieve this automation is by creating a development pipeline, which takes the source code and moves it through a set of largely automatic transformations, e.g. compilation, testing, packaging and potentially deployment, to obtain a validated, and potentially deployed, application.

In order to configure a development pipeline for one or more development teams, whether on-premises or in the cloud, one has to understand the requirements placed on the pipeline, which tooling is available to create it, whether the pipeline will be situated on-premises, in the cloud or a combination of both, and how the pipeline will be assembled and managed. This post series discusses some of these issues, starting with the requirements, or considerations, that need to be given to the characteristics and behaviours of the development pipeline.

Prior to discussing the considerations for selecting tooling and infrastructure for a development pipeline, it is important to decide which elements are part of the pipeline and which are not. For the remainder of this post series the development pipeline is considered to consist of:

  • The scripts that are used during the different parts of the cycle, i.e. the build, test and release scripts.
  • The continuous integration system which is used to execute the different scripts.
  • The tools, like the compiler, test frameworks, etc.

Items like version control, package management, issue tracking, system monitoring and customer support are not included in the discussion. While these systems are essential in application development and deployment, each of them spans a large enough area to warrant a more thorough discussion than can be given in this post series.

Additionally, the following terms will be used throughout the series:

  • Input set - A collection of information that is provided to the pipeline to start the process of generating the desired artefacts. An input set may consist of source code, e.g. a given commit in the source control system, files or documents, configuration values, or any other combination of information required to create, validate and deploy the product artefacts. An input set should contain all the information needed to generate and deploy the product artefacts, and each time a specific input set is provided to the pipeline exactly the same artefacts will be produced.
  • Executor - In general the development pipeline is driven by a continuous integration system, which itself consists of a controlling unit, which receives tasks and distributes them, and a set of executing units, which perform the actual computational tasks. In small systems the controlling unit may also perform the computational tasks; however, even in this case a distinction can be made between the controlling and executing parts.

In order to start the selection of suitable components for a development pipeline it is important to know what the desirable properties of such a system are. The following properties, listed in order of importance, are considered the most critical ones:

  • Accuracy: The pipeline must return the right outputs for a specific input, i.e. it should report errors to the interested parties if there are any, and report success and produce the desired artefacts if there are none.
  • Performance: The pipeline must push changes through the different stages fast in order to get results back to the interested parties as soon as possible.
  • Robustness: The pipeline must be able to cope with environmental changes, both expected and unexpected.
  • Flexibility: The pipeline must be easy to adapt to new products and tools so that it can be used without having to replace large parts each time a new product or tool needs to be included.

There are of course other desirable properties for a development pipeline, like ease of use, maintainability, etc., but the aforementioned properties are considered the most critical ones. The linked posts provide additional reasons why each of these properties is important and how a software development pipeline can be designed to satisfy these considerations.

Additional posts about this topic can be found via the Software development pipeline tag.

Edits

  • August 30th: Replaced the term correctness with the term accuracy because that is a better description of the combined concepts of consistency and correctness.
  • September 4th: Added the link to the post providing the high-level description of how to achieve accuracy for a development pipeline.

nBuildKit release - V0.10.0

Sunday, July 16, 2017 | Posted in nBuildKit

Version 0.10.0 and version 0.10.1 of nBuildKit have been released. The 0.10.0 release is the first release in the stabilization cycle; there are still a few major changes, but the goal is to start stabilizing nBuildKit in preparation for the 1.0.0 release.

The highlights are

  • Renamed DirTest to DirTests. This is a breaking change.
  • Additional metadata can be provided with the build, test and deploy steps to indicate whether a step should be inserted before or after another step in the sequence. Steps which have no insertion limits, i.e. those steps that do not have the before or after metadata set, are inserted in document order. Steps that do define insertion limits are inserted according to those limits. Note that circular limits are not allowed, nor are step sequences in which every step defines insertion limits.
  • Added support for deploying artefacts to an HTTP server.
  • Added an initial implementation of the ability to verify GPG hash signatures. The current implementation requires that the GPG tooling is allowed through the firewall.

In addition to the major changes, some additional build, test and deploy steps have also been defined.

All the work items that have been closed for this release can be found on GitHub in the milestone lists for releases 0.10.0 and 0.10.1.