Adding --list-config-files to Mozharness

In First Assignment, I touched on how Mozharness can take multiple configs. On any script you can specify, say:

./scripts/some_script --cfg path/to/config1 --cfg path/to/config2 [etc..]

The way this works is pretty simple: we look at each config and see whether it is a URL or a local path, then validate that it exists. Finally, we build what ultimately becomes self.config by calling .update() on it with each config given, in order, so later configs override earlier ones.
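
To make the ordering concrete, here is a minimal sketch of that merge step. It is not the Mozharness implementation: merge_configs and the two config dicts are made up for illustration, and the only thing it relies on is how dict.update() behaves.

```python
def merge_configs(config_dicts):
    """Merge config dicts in order; later dicts override earlier ones."""
    merged = {}
    for cfg in config_dicts:
        merged.update(cfg)
    return merged


# hypothetical config contents, for illustration only
config1 = {"platform": "linux", "purge_minsize": 12}
config2 = {"purge_minsize": 14, "enable_ccache": True}

merged = merge_configs([config1, config2])
print(merged["purge_minsize"])  # config2 came last, so its value is kept
```

Swapping the order of the two dicts would leave purge_minsize at config1's value, which is why the order of --cfg arguments matters.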

This is nice and allows for some powerful configuration-driven scripts. However, while porting desktop builds from Buildbot to Mozharness, I have come across two features that would be nice to have:

  1. transparency about which config file each key/value in self.config comes from.
  2. the ability to add only a specific part of a config file to self.config, while dictating the config file hierarchy per script run.

The motivation for these came from the realization that Firefox desktop builds cannot have just one config file per build type. We would need hundreds of config files! This is because, if you think about it, we have multiple platforms, repo branches, special variants (asan, debug, etc.), and build pools for each build. Every permutation of these has its own config keys/values. Take a peek at buildbot-configs to see how hard this is to manage.

I needed a solution that scales and is easy to understand. Here is what I've done:

First, before I dive into the two features I mentioned above, let's recap from my last post. Rather than passing each config explicitly, I gave the script some 'intelligence' when it comes to figuring out which configs to use. There is no black magic: when it runs, the script will tell you which configs it is using and how it is using them.

For example, specifying:

--custom-build-variant debug

tells the script to look for the correct debug config for your platform and arch, e.g.:

configs/builds  
└── releng_sub_linux_configs
    ├── 32_debug.py
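
As a rough sketch of how a short name could turn into that path: the templates below are copied from the build_variants mapping shown later in this post, while resolve_variant and the way platform and bits are passed in are assumptions made for illustration.

```python
# Templates mirror the build_variants mapping shown later in this post.
BUILD_VARIANTS = {
    "asan": "builds/releng_sub_%s_configs/%s_asan.py",
    "debug": "builds/releng_sub_%s_configs/%s_debug.py",
}


def resolve_variant(shortname, platform, bits):
    """Hypothetical helper: expand a variant short name into a config path."""
    return BUILD_VARIANTS[shortname] % (platform, bits)


debug_cfg = resolve_variant("debug", "linux", "32")
print(debug_cfg)
```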

You can also just explicitly give the path to the build variant config file. For a list of valid shortnames, see '--custom-build-variant --help'. Either way, this adds another config file:

config = {  
    'default_actions': [
        'clobber',
        'pull',
        'setup-mock',
        'build',
        'generate-build-props',
        # 'generate-build-stats', debug skips this action
        'symbols',
        'packages',
        'upload',
        'sendchanges',
        # 'pretty-names', debug skips this action
        # 'check-l10n', debug skips this action
        'check-test',
        'update', # decided by query_is_nightly()
        'enable-ccache',
    ],
    'platform': 'linux-debug',
    'purge_minsize': 14,
    ...etc

Next up, the branch:

--branch mozilla-central

This does two things: (1) it sets the branch we are using (self.config['branch']), and (2) it adds another config file to the script:

configs/builds/branch_specifics.py  

This is a config file of dicts:

config = {  
    "mozilla-central": {
        "update_channel": "nightly",
        "create_snippets": True,
        "create_partial": True,
        "graph_server_branch_name": "Firefox",
    },
    "cypress": {
    ...etc
    }
}

The script will see if the branch name given matches any branch keys in this config. If it does, it will add those keys/values to self.config.
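
A minimal sketch of that lookup, assuming the branch file has already been parsed into a dict. apply_branch_config and the starting "update_channel": "default" value are hypothetical; the mozilla-central keys are taken from the excerpt above.

```python
def apply_branch_config(config, branch_specifics, branch):
    """Copy only the named branch's keys/values into config, if present."""
    branch_cfg = branch_specifics.get(branch)
    if branch_cfg:
        config.update(branch_cfg)
    return config


branch_specifics = {
    "mozilla-central": {
        "update_channel": "nightly",
        "create_snippets": True,
    },
    "cypress": {},  # contents elided in the post
}

# hypothetical starting config
config = {"branch": "mozilla-central", "update_channel": "default"}
apply_branch_config(config, branch_specifics, config["branch"])
print(config["update_channel"])
```

An unknown branch name simply leaves config untouched, and none of the other branches' dicts leak into it.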

Specifying the build pool behaves similarly:

--build-pool staging

adds the cfg file:

configs/builds/build_pool_specifics.py  

which adds the dict under key 'staging' to self.config:

config = {  
    "staging": {
        'balrog_api_root': 'https://aus4-admin-dev.allizom.org',
        'aus2_ssh_key': 'ffxbld_dsa',
        'aus2_user': 'ffxbld',
        'aus2_host': 'dev-stage01.srv.releng.scl3.mozilla.com',
        'stage_server': 'dev-stage01.srv.releng.scl3.mozilla.com',
        # staging/preproduction we should use MozillaTest
        # but in production we let the self.branch decide via
        # self._query_graph_server_branch_name()
        "graph_server_branch_name": "MozillaTest",
    },
    "preproduction": {
    ...etc

Put together, we can do something like:

./scripts/fx_desktop_build.py --cfg configs/builds/releng_base_linux_64_builds.py --custom-build-variant asan --branch mozilla-central --build-pool staging

Which is kind of cool. Doing this, though, requires some 'user friendly' touches. For example, the order in which these arguments come shouldn't matter: the hierarchy of configs should be consistent. For desktop builds, the hierarchy goes, from highest precedence to lowest: build-pool, branch, custom-build-variant, and finally the base --cfg [file] passed (the platform and arch).
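
One way to picture a fixed hierarchy that ignores argument order is to tag each config with its kind and sort before merging. This is a simplified stand-in rather than what the script does (the actual approach, shown further down, pulls the special files out of the list and re-appends them). The names PRECEDENCE, order_layers and build_config, and the sample values, are made up for illustration.

```python
# Lowest to highest precedence, matching the order described above.
PRECEDENCE = ["base", "custom-build-variant", "branch", "build-pool"]


def order_layers(layers):
    """Sort (kind, config_dict) pairs so higher-precedence kinds come last."""
    return sorted(layers, key=lambda layer: PRECEDENCE.index(layer[0]))


def build_config(layers):
    """Merge the layers lowest first, so the highest precedence wins."""
    config = {}
    for _kind, cfg in order_layers(layers):
        config.update(cfg)
    return config


# Layers listed in an arbitrary order, as they might arrive from the CLI.
layers = [
    ("build-pool", {"graph_server_branch_name": "MozillaTest"}),
    ("base", {"graph_server_branch_name": "base-value", "use_mock": True}),
    ("branch", {"graph_server_branch_name": "Firefox"}),
]
final_config = build_config(layers)
print(final_config["graph_server_branch_name"])
```

Whatever order the layers are given in, the build-pool value wins for any key it defines.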

Another friendly touch needed was the ability to show how this hierarchy affects self.config. As in, which config file do the keys/values come from?

Finally, you may have noticed that for branch and build-pool, I only use one config file for each: 'branch_specifics.py' and 'build_pool_specifics.py'. I don't care about any branch or pool in those files other than the one specified, so we should not add the rest to self.config.

This ties in to my two features I wanted to add to Mozharness.

(recap):

  1. transparency about which config file each key/value in self.config comes from.
  2. the ability to add only a specific part of a config file to self.config, while dictating the config file hierarchy per script run.

So how did I achieve this? Let's look at some code snippets:

I will look at three sections of code.

  • how the script allows users to specify options like '--branch' and get 'branch_specifics.py' added to my list of config files.
  • how we can specify the options in any order while preserving the hierarchy, and how we can grab only part of a config file.
  • how we can list out each config file used and which keys/values are used in making self.config.

*Note: I made the first of these as a convenience for devs/users who will use this script. You can always just explicitly specify '--cfg path/to/cfg' for each config if you wish.

  1. In fx_desktop_build.py I wrote an option parser helper class:

class FxBuildOptionParser(object):  
    platform = None
    bits = None
    build_variants = {
        'asan': 'builds/releng_sub_%s_configs/%s_asan.py',
        'debug': 'builds/releng_sub_%s_configs/%s_debug.py',
        'asan-and-debug': 'builds/releng_sub_%s_configs/%s_asan_and_debug.py',
        'stat-and-debug': 'builds/releng_sub_%s_configs/%s_stat_and_debug.py',
    }
    build_pools = {
        'staging': 'builds/build_pool_specifics.py',
        'preproduction': 'builds/build_pool_specifics.py',
        'production': 'builds/build_pool_specifics.py',
    }
    branch_cfg_file = 'builds/branch_specifics.py'

    @classmethod
    def set_build_branch(cls, option, opt, value, parser):
        # first let's add the branch_specific file where there may be branch
        # specific keys/values. Then let's store the branch name we are using
        parser.values.config_files.append(cls.branch_cfg_file)
        setattr(parser.values, option.dest, value) # the branch name

The above only shows the method for resolving the branch config (build-pool and custom-build-variant live in this class as well). Here is how --branch knows where to look:

class FxDesktopBuild(BuildingMixin, MercurialScript, object):  
    config_options = [
        [['--branch'], {
            "action": "callback",
            "callback": FxBuildOptionParser.set_build_branch,
            "type": "string",
            "dest": "branch",
            "help": "This sets the branch we will be building this for. "
                    "If this branch is in branch_specifics.py, update our "
                    "config with specific keys/values from that. See "
                    "%s for possibilites" % (
                        FxBuildOptionParser.branch_cfg_file,
                    )}
         ],

         ...etc
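
To see the callback mechanism in isolation, here is a small standalone sketch using plain optparse from the standard library. The '--cfg' option below is only a stand-in I am assuming for the demo (Mozharness defines its own), and make_parser is a hypothetical helper. The callback body mirrors set_build_branch above.

```python
from optparse import OptionParser

BRANCH_CFG_FILE = "builds/branch_specifics.py"


def set_build_branch(option, opt, value, parser):
    # queue the shared branch config file, then record the branch name
    parser.values.config_files.append(BRANCH_CFG_FILE)
    setattr(parser.values, option.dest, value)


def make_parser():
    parser = OptionParser()
    # stand-in for Mozharness' own --cfg option (an assumption for this demo)
    parser.add_option("--cfg", action="append", dest="config_files",
                      type="string", default=[])
    parser.add_option("--branch", action="callback",
                      callback=set_build_branch, type="string",
                      dest="branch")
    return parser


options, _args = make_parser().parse_args(
    ["--cfg", "base.py", "--branch", "mozilla-central"]
)
print(options.config_files)
print(options.branch)
```

Note that the branch file lands in config_files at whatever position '--branch' appeared on the command line, which is why the hierarchy has to be enforced in a later step.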

  2. In mozharness/base/config.py I extracted the part of parse_args() where we find the config files and add their dicts, and put it into a separate method, get_cfgs_from_files():

# append opt_config to allow them to overwrite previous configs
all_config_files = options.config_files + options.opt_config_files  
all_cfg_files_and_dicts = self.get_cfgs_from_files(  
    all_config_files, parser=options
)

get_cfgs_from_files():

def get_cfgs_from_files(self, all_config_files, parser):
    """ returns a list of (file name, dict) tuples from a given list of
    config files.

    this method can be overridden in a subclassed BaseConfig
    """
    # this is what we will return. It will represent each config
    # file name and its associated dict
    # eg ('builds/branch_specifics.py', {'foo': 'bar'})
    all_cfg_files_and_dicts = []
    for cf in all_config_files:
        try:
            if '://' in cf: # config file is an url
                file_name = os.path.basename(cf)
                file_path = os.path.join(os.getcwd(), file_name)
                download_config_file(cf, file_path)
                all_cfg_files_and_dicts.append(
                    (file_path, parse_config_file(file_path))
                )
            else:
                all_cfg_files_and_dicts.append((cf, parse_config_file(cf)))
        except Exception:
            if cf in parser.opt_config_files:
                print(
                    "WARNING: optional config file not found %s" % cf
                )
            else:
                raise
    return all_cfg_files_and_dicts

This is largely the same as it was before, but you may notice that I now collect file names and their associated dicts in a list of tuples. This is on purpose, so I can implement the ability to list configs and the keys/values used from each (I'll get to that later).

By extracting this out, I can now do something fun like subclass BaseConfig and override get_cfgs_from_files():

Back in fx_desktop_build.py (see the full method for more inline comments):

class FxBuildConfig(BaseConfig):

    def get_cfgs_from_files(self, all_config_files, parser):
        # overridden from BaseConfig
        # each entry pairs a config file name with its dict
        all_config_dicts = []
        # so, let's first assign the configs that hold a known position of
        # importance (1 through 3). Each stays None unless it shows up in
        # all_config_files
        variant_cfg_file = branch_cfg_file = pool_cfg_file = None
        for cf in all_config_files:
            if parser.build_pool:
                if cf == FxBuildOptionParser.build_pools[parser.build_pool]:
                    pool_cfg_file = cf
            if cf == FxBuildOptionParser.branch_cfg_file:
                branch_cfg_file = cf
            if cf == parser.build_variant:
                variant_cfg_file = cf

        # now remove these from the list if there were any.
        # we couldn't pop() these in the above loop as mutating a list while
        # iterating through it causes spurious results :)
        for cf in [pool_cfg_file, branch_cfg_file, variant_cfg_file]:
            if cf:
                all_config_files.remove(cf)

        # now let's update config with the remaining config files
        for cf in all_config_files:
            all_config_dicts.append((cf, parse_config_file(cf)))

        # stack variant, branch, and pool cfg files on top of that,
        # if they are present, in that order
        if variant_cfg_file:
            # take the whole config
            all_config_dicts.append(
                (variant_cfg_file, parse_config_file(variant_cfg_file))
            )
        if branch_cfg_file:
            # take only the specific branch, if present
            branch_configs = parse_config_file(branch_cfg_file)
            if branch_configs.get(parser.branch or ""):
                print(
                    'Branch found in file: "builds/branch_specifics.py". '
                    'Updating self.config with keys/values under '
                    'branch: "%s".' % (parser.branch,)
                )
                all_config_dicts.append(
                    (branch_cfg_file, branch_configs[parser.branch])
                )
        if pool_cfg_file:
            # largely the same logic as adding branch_cfg
            # (elided in this post; see the full method)
            pass
        return all_config_dicts

  3. Finally, I added an option '--list-config-files' in Mozharness that any script can use:

self.config_parser.add_option(
    "--list-config-files", action="store_true",
    dest="list_config_files",
    help="Displays what config files are used and how their "
         "hierarchy dictates self.config."
)

def list_config_files(self, cfgs=None):
    # go through each config file. We will start with the lowest and print
    # its keys/values that are being used in self.config. If any
    # keys/values are present in a config file with a higher precedence,
    # ignore those.
    if not cfgs:
        cfgs = []
    print("Total config files: %d" % (len(cfgs),))
    if cfgs:
        print("Config files being used from lowest precedence to highest:")
        print("=====================================================")
    for i, (lower_file, lower_dict) in enumerate(cfgs):
        unique_keys = set(lower_dict.keys())
        # iterate through the remaining 'higher' cfgs above lower_dict
        for higher_file, higher_dict in cfgs[i + 1:]:
            # now only keep keys/values that are not overwritten by a
            # higher config
            unique_keys = unique_keys.difference(set(higher_dict.keys()))
        # unique_dict now has only the keys/values that are unique to
        # this config file.
        unique_dict = {k: lower_dict[k] for k in unique_keys}
        print("Config File %d: %s" % (i + 1, lower_file))
        # let's do some sorting and formatting so the dicts are parsable.
        # the [0] fallback covers a file whose keys are all overridden
        max_key_len = max([len(key) for key in unique_dict] or [0])
        for key, value in sorted(unique_dict.items()):
            # pretty print format for dict
            cfg_format = " %%s%%%ds %%s" % (max_key_len - len(key) + 2,)
            print(cfg_format % (key, '=', value))
        print("=====================================================")
    # finally exit since we only wish to see how the configs are laid out
    raise SystemExit(0)

Again, the actual method has proper docstrings and more comments.
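
The core of that method is the set difference. Stripped of the printing, it can be sketched as a small pure function. effective_keys and the two sample files are hypothetical; the logic follows the loop above, which filters on key names only, not on whether the values differ.

```python
def effective_keys(cfgs):
    """For each (file_name, dict) pair, ordered lowest to highest precedence,
    return the keys that no later pair redefines."""
    result = []
    for i, (file_name, cfg) in enumerate(cfgs):
        keys = set(cfg)
        for _higher_file, higher_cfg in cfgs[i + 1:]:
            keys -= set(higher_cfg)
        result.append((file_name, sorted(keys)))
    return result


# hypothetical config files and contents
cfgs = [
    ("base.py", {"platform": "linux64", "graph_server_branch_name": "x"}),
    ("pool.py", {"graph_server_branch_name": "MozillaTest"}),
]
report = effective_keys(cfgs)
for file_name, keys in report:
    print("%s: %s" % (file_name, ", ".join(keys)))
```

Here graph_server_branch_name is reported only under pool.py, since the higher-precedence file redefines it.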

This can all be tried out. Simply clone from here: https://github.com/lundjordan/mozharness/tree/fx-desktop-builds
and try out a few '--list-config-files' runs against the script. This option causes the script to end before actually running any actions (akin to 'list-actions'), so you can hammer away at my script with various options. Or try it against any of the other Mozharness scripts.

WARNING: fx_desktop_build.py is actively under development. Aside from it breaking from time to time, it is only made for our continuous integration system. There are no configs that are friendly for local development, so only run this if you also supply '--list-config-files'.

Notice that each one outputs only the keys/values that are actually used by self.config. E.g., if releng_base_linux_64_builds.py has a key whose value differs from the same key in the build pool config, it won't be shown under the base file (the build pool has precedence).

Examples you can try:

./scripts/fx_desktop_build.py --cfg configs/builds/releng_base_linux_64_builds.py --custom-build-variant asan --branch mozilla-central --build-pool staging --list-config-files
./scripts/fx_desktop_build.py --cfg configs/builds/releng_base_linux_32_builds.py --custom-build-variant debug --build-pool preproduction --list-config-files
./scripts/fx_desktop_build.py --cfg configs/builds/releng_base_linux_64_builds.py --custom-build-variant asan --list-config-files
./scripts/fx_desktop_build.py --list-config-files

Example output (large parts of this are omitted for brevity):

Branch found in file: "builds/branch_specifics.py". Updating self.config with keys/values under branch: "mozilla-central".  
Build pool config found in file: "builds/build_pool_specifics.py". Updating self.config with keys/values under build pool: "staging".  
Total config files: 4  
Config files being used from lowest precedence to highest:  
=====================================================
Config File 1: configs/builds/releng_base_linux_64_builds.py  
 buildbot_json_path        = buildprops.json
 clobberer_url             = http://clobberer.pvt.build.mozilla.org/index.php
 default_vcs               = hgtool
 do_pretty_name_l10n_check = True
 enable_ccache             = True
 enable_count_ctors        = True
 enable_package_tests      = True
 exes                      = {'buildbot': '/tools/buildbot/bin/buildbot'}
 graph_branch              = MozillaTest
 graph_selector            = /server/collect.cgi
 graph_server              = graphs.allizom.org
 hgtool_base_bundle_urls   = ['http://ftp.mozilla.org/pub/mozilla.org/firefox/bundles']
 hgtool_base_mirror_urls   = ['http://hg-internal.dmz.scl3.mozilla.com']
 latest_mar_dir            = /pub/mozilla.org/firefox/nightly/latest-%(branch)s
 mock_mozilla_dir          = /builds/mock_mozilla
 use_mock                  = True
 vcs_share_base            = /builds/hg-shared
 etc...
=====================================================
Config File 2: /Users/jlund/devel/mozilla/dirtyRepos/mozharness/scripts/../configs/builds/releng_sub_linux_configs/64_asan.py  
 base_name                               = Linux x86-64 %(branch)s asan
 default_actions                         = ['clobber', 'pull', 'setup-mock', 'build', 'generate-build-props', 'symbols', 'packages', 'upload', 'sendchanges', 'check-test', 'update', 'enable-ccache']
 enable_signing                          = False
 enable_talos_sendchange                 = False
 platform                                = linux64-asan
etc ...  
=====================================================
Config File 3: builds/branch_specifics.py  
 create_partial  = True
 create_snippets = True
 update_channel  = nightly
=====================================================
Config File 4: builds/build_pool_specifics.py  
 aus2_host                = dev-stage01.srv.releng.scl3.mozilla.com
 aus2_ssh_key             = ffxbld_dsa
 aus2_user                = ffxbld
 balrog_api_root          = https://aus4-admin-dev.allizom.org
 download_base_url        = http://dev-stage01.srv.releng.scl3.mozilla.com/pub/mozilla.org/firefox/nightly
 graph_server_branch_name = MozillaTest
 sendchange_masters       = ['dev-master01.build.scl1.mozilla.com:8038']
 stage_server             = dev-stage01.srv.releng.scl3.mozilla.com
=====================================================