If you are using Python packages like xmltodict or yaml, here is something to be aware of

If you are using Python packages like xmltodict or yaml to write and read your own XML and YAML files, you probably don’t need to know this. But if you are reading someone else’s files, here is something to be aware of.

This week I had to process an XML file in Python. No problem, I thought, I’ll use a Python package like xmltodict to translate the XML into a dictionary variable. Then I could edit it and print out a new file with the changes. Sounds easy!

Well, first off, it wasn’t too easy: the nesting was horrendous. However, with some help from VS Code, I was able to power through and get the value I wanted.

Here’s where I got burned. I wanted to change the text in the XML file, so I had a statement like this to read it:


mytext = python_dict["graphml"]["graph"]["node"][nodecount]["graph"]["node"][i]["data"]["y:ShapeNode"]["y:NodeLabel"]["#text"]

and then a simple statement like this to change it to lowercase:


python_dict["graphml"]["graph"]["node"][nodecount]["graph"]["node"][i]["data"]["y:ShapeNode"]["y:NodeLabel"]["#text"] = mytext.lower()

Very basic.
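For what it's worth, the long chain of subscripts can be tamed a little by keeping the path in a list. This is only a sketch: get_path and set_path are helper names I made up, and the tiny dictionary below is a hypothetical stand-in for what xmltodict returned for my file.

```python
from functools import reduce


def get_path(data, path):
    """Follow a sequence of dict keys / list indexes into nested data."""
    return reduce(lambda node, key: node[key], path, data)


def set_path(data, path, value):
    """Replace the value at the end of the path, in place."""
    parent = get_path(data, path[:-1])
    parent[path[-1]] = value


# Hypothetical miniature of the nested structure (not the real file).
python_dict = {
    "graphml": {"graph": {"node": [
        {"graph": {"node": [
            {"data": {"y:ShapeNode": {"y:NodeLabel": {"#text": "Web Server"}}}}
        ]}}
    ]}}
}

nodecount, i = 0, 0
label_path = ["graphml", "graph", "node", nodecount, "graph", "node", i,
              "data", "y:ShapeNode", "y:NodeLabel", "#text"]

# Same read-then-lowercase step as above, with the path written once.
set_path(python_dict, label_path, get_path(python_dict, label_path).lower())
print(get_path(python_dict, label_path))  # web server
```

This does not fix the problem described next. It only makes the read and the write share one path.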

Now this particular file is an XML file with a .graphml extension, which allows an editor like yEd to read it. yEd can read the original file, but it turns out xmltodict writes the file in such a way that the yEd editor can no longer see the text. I don’t know why.

I spent hours working on it until I finally gave up. I wrote a much dumber program that read through the graphml file a line at a time and changed it the way I wanted to. No fancy packages involved. Dumb but it worked.
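A rough sketch of that kind of dumb line-at-a-time program is below. It is not my original code. The regular expression, the function names and the file paths are all illustrative, and it assumes the label text starts on the same line as its opening y:NodeLabel tag and that no attribute value contains a ">" character.

```python
import re

# An opening <y:NodeLabel ...> tag (not a self-closing one) followed by the
# text that comes directly after it on the same line.
LABEL = re.compile(r"(<y:NodeLabel\b[^>]*(?<!/)>)([^<]+)")


def lowercase_labels(line):
    """Lowercase the label text on one line; leave everything else alone."""
    return LABEL.sub(lambda m: m.group(1) + m.group(2).lower(), line)


def rewrite_file(src_path, dst_path):
    """Copy src_path to dst_path one line at a time, lowercasing label text."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(lowercase_labels(line))
```

Calling rewrite_file("original.graphml", "lowercased.graphml") (hypothetical file names) leaves every byte outside the label text exactly as it was in the source file, which is the reason for working on raw lines instead of parsing and re-serialising the XML.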

This is the second time this year a package has given me problems. In late January I wrote some code to parse YAML files for a client, to extract information for them and produce a report. Again, there is a package to do that: yaml. Which is… good… except when the YAML it is processing is poorly written. Which this YAML was.

Again, I spent hours linting the yaml and in some cases having to forgo certain files because they were poorly constructed. What should have been easy — read the yaml file, transform it, write a new yaml file — was instead very difficult.

And that’s often the problem with yaml files and XML and JSON files: they are often handcrafted and inconsistent. They MAY be good enough for whatever tool is ingesting them, but not good enough for the packages you want to use to process them.
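One thing that might save some of those hours is sorting the files into "parses" and "doesn't parse" before doing any real work. Here is a minimal sketch. The triage function is a name I made up, and it defaults to the standard library's json.loads only so that it runs without installing anything.

```python
import json
from pathlib import Path


def triage(paths, parse=json.loads):
    """Split files into those the parser accepts and those it rejects.

    `parse` is any function that takes the file's text and raises an
    exception on bad input.
    """
    good, bad = [], {}
    for path in paths:
        try:
            parse(Path(path).read_text(encoding="utf-8"))
            good.append(path)
        except Exception as err:  # broad on purpose: parsers raise different types
            bad[path] = f"{type(err).__name__}: {err}"
    return good, bad
```

With PyYAML installed, passing parse=yaml.safe_load should give the same report for YAML files. The bad dictionary then tells you which files need hand repair, or need to be skipped, before the transform step ever runs.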

I think those packages are great if you are making the input files. But if you are processing someone else’s, caveat emptor (caveat programmer?).

A trick to resolve issues with YAML files for Kubernetes

I was going through the exercise “Using Calico network policies to block traffic” when I thought that instead of deploying the webserver image using this command:

kubectl run webserver --image=k8s.gcr.io/echoserver:1.10 --replicas=3

I would create a YAML file to deploy the webserver instead. Unfortunately, there was something about my YAML file that was preventing things from working. That’s when I came across this trick.

  • Step 1: deploy the web server using the kubectl run command.
  • Step 2: run the following command to get the YAML back for the deployment:


kubectl get deployment webserver --output yaml > webserver.yaml

  • Step 3: edit the webserver.yaml file to remove extra lines. For me, I was able to remove:
    •  the status section
    • the annotations section
    • the strategy section
    • etc.

And just keep the following lines:


apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: webserver
  name: webserver
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      run: webserver
  template:
    metadata:
      labels:
        run: webserver
    spec:
      containers:
      - image: k8s.gcr.io/echoserver:1.10
        name: webserver

Now, you do not have to edit the file. But I think this is cleaner than the full version that comes back.

So you can delete the deployment that was the result of the command line and instead build future deployments using the yaml file.
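If you would rather script the trimming in step 3 than do it by hand, something like the sketch below might work. It assumes the exported YAML has already been loaded into a Python dictionary, and trim_deployment is a name I made up. It removes only the three sections named above, so whatever falls under "etc." is still up to you.

```python
import copy


def trim_deployment(manifest):
    """Return a copy of an exported Deployment without the status,
    annotations and strategy sections."""
    trimmed = copy.deepcopy(manifest)  # leave the caller's dictionary untouched
    trimmed.pop("status", None)
    trimmed.get("metadata", {}).pop("annotations", None)
    trimmed.get("spec", {}).pop("strategy", None)
    return trimmed
```

With PyYAML installed, yaml.safe_load can turn webserver.yaml into that dictionary and yaml.safe_dump can write the trimmed result back out.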