Automating Legacy Desktop Applications With JRuby and Sikuli (With Notes)
Hi, I’m Chris Cha. I am a full-stack dev working in test automation and DevOps…
… at The Container Store, which specializes in organizing solutions like custom closets, and the joy of tidying up your space. Yes — we still use containers, but also
cloud and orchestration technologies to define a scalable, highly available future. Join us.
Automating Legacy Desktop
Applications with JRuby and Sikuli
The title of my talk today is “Automating Legacy Desktop Applications with JRuby and Sikuli.”
Here is an example of my development environment. I have vim to edit text, SikuliIDE to manage screenshots, xfreerdp to remote into the Windows VM, and multiple
terminals to run vagrant and other chores. Within the Windows VM, I used Command Prompt to run individual tests with JRuby. Our goal is to automate a very battle-
tested, legacy application.
A universe for apple pie
• ed transforms
To start, I couldn’t just run Windows. I’m on a Mac. The app is Windows only. My go-to for VMs is VirtualBox. Vagrant provided an easy way to export from VirtualBox. I
found a Windows 10 Pro image from Microsoft and made changes to help with remoting. If I could run JRuby with Sikuli, I could win. But it was all only imaginative
strategy at this point. I guess my ambition was to learn vim, Vagrant, Sikuli, and JRuby+Java all at once, and assumed it would integrate in the end.
Sneak ed in
Here is the transform script. It takes every .ed file in the transforms/ directory and applies it to the Vagrantfile. I wanted to be able to start from scratch at any time, create a
new folder, and be able to get to the current state. This forced me to keep changes isolated, and in retrospect helped manage complexity. The Vagrantfile would have
otherwise been littered with commented half-experiments and distraction over syntax.
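The slide's script is not reproduced in these notes, so here is a minimal Ruby sketch of the same idea, assuming the transforms are numbered so that filename order is application order. The helper names `transform_commands` and `apply_transforms` and the two demo filenames are hypothetical; `ed -s` is the standard quiet mode of ed.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: build one `ed` invocation per transform, in filename order.
# Numbered names such as 001-foo.ed sort into the order they should be applied.
def transform_commands(transform_dir, target)
  Dir.glob(File.join(transform_dir, '*.ed')).sort.map do |script|
    { script: script, argv: ['ed', '-s', target] }
  end
end

# Run each transform by feeding the script to ed on standard input.
# Requires the `ed` binary on PATH; stops at the first failure.
def apply_transforms(transform_dir, target)
  transform_commands(transform_dir, target).each do |cmd|
    ok = system(*cmd[:argv], in: cmd[:script])
    raise "transform failed: #{cmd[:script]}" unless ok
  end
end

# Demo: list what would run, without touching a real Vagrantfile.
Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, '002-verify-java.ed'))
  FileUtils.touch(File.join(dir, '001-forward-rdp-port.ed'))
  transform_commands(dir, 'Vagrantfile').each do |cmd|
    puts "#{cmd[:argv].join(' ')} < #{File.basename(cmd[:script])}"
  end
end
```

Keeping the command list separate from the code that runs it makes it possible to preview a rebuild before applying it to a fresh copy of the public image's Vagrantfile.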
A regex scalpel
• a — append
• w — write
• q — quit
From previous attempts, I knew xfreerdp would let me connect to Windows machines. I just needed to figure out how to do that with a Windows VM instead of a remote
host. Fortunately, many other people had solved this problem. Our first transform looks for the line “config.vm.network” and appends content after it. It almost looks like a
code block! Rubocop always reminds me to put a newline at the end, and it was important to do that with .ed files.
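For readers who have not used ed, the search-then-append idiom can be mirrored in a few lines of Ruby. This is only an illustration of what the transform does, not the transform itself; the function name, the sample Vagrantfile text, and the host port are assumptions.

```ruby
# Emulates the ed idiom "/pattern/a ... ." in plain Ruby:
# find the first line matching `pattern` and insert `new_lines` after it.
def append_after_match(text, pattern, new_lines)
  lines = text.lines
  index = lines.index { |line| line.match?(pattern) }
  raise ArgumentError, "no line matches #{pattern.inspect}" if index.nil?

  # Every appended line gets a trailing newline, as ed would write it.
  addition = new_lines.map { |line| line.end_with?("\n") ? line : "#{line}\n" }
  lines.insert(index + 1, *addition).join
end

vagrantfile = <<~RUBY
  Vagrant.configure("2") do |config|
    # config.vm.network "forwarded_port", guest: 80, host: 8080
  end
RUBY

# Hypothetical forwarding rule for RDP (guest port 3389); the host port is an assumption.
rule = '  config.vm.network "forwarded_port", guest: 3389, host: 33389'
puts append_after_match(vagrantfile, /config\.vm\.network/, [rule])
```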
A fresh, free cup of Java
From earlier research, I knew the app in question pulled in its own JDK. I just needed to bootstrap the app, and AdoptOpenJDK had the easiest way to install. So we’ll
call this the “Is this even going to work” step. If anything, at least I had a Vagrantfile for a nice Windows VM. And JRuby needed Java. I could use Command Prompt to
poke around inside the system.
Like every software install ever
• Unzip it
• Set PATH
• Done
Here is the batch script we use to set up Java. We download, extract, set PATH, and done. This is the usual route for most binaries. I avoided using a package manager for
now, since I didn’t want issues there to confound my attempts. I really like calling `powershell` to extend what a batch script can do.
Hello world
Since Vagrant prints things to console, we could invoke commands within the VM OS context and get useful results back. Here, we invoke `java` and ask its version. This
verifies the PATH is set. JRuby will know where to get its Java, and all will be well. More challenges await.
Every block has its end
• / — go to the end of file
• . — done
• w — write
• q — quit
• (newline)
My second transform is a sanity check, just to make sure .ps1 files work, so PowerShell will execute as well in case we need it later. It is also a way to append to the end
of the Vagrantfile instead of the middle. We pretty much use this approach to add on the rest of the changes.
[SOLVED] with indirection
Here is the transform to the Vagrantfile, to layer JDK installation on top. Every change, big or small, is run through a pattern of a transform and a script.
Nice to meet you
• Not yet
Here we just layer on the Java verification script, where we check the Java version.
I know you’re out there
• \\somewhere\out\there
So, we still need our app to test. It’s not available from the VM. We have to retrieve it from somewhere. Our source is a mapped network drive, so we need to map a
network drive. We figured out Java first, since that was easier, but now we dig further into Windows. We need an app to run.
Rinse and repeat
• Apply transforms
• xfreerdp
Interlude: it might be nice to share the Vagrant dev loop. We retrieve a public image, apply transforms, and then start it up with Vagrant. We reboot too, and then try to
remote in with xfreerdp. We now have a Java development environment, but it can become so much more.
Reliable drive mapping
• Scheduled task
We need a strategy to map a network drive correctly, every time. We can do this with a scheduled task. We use the XML version by exporting a basic task, because this is
the only way to create a task with all conditions disabled. For example, on a laptop that is not plugged in, the scheduled task would not run because it would detect the
“server” (our machine) was on batteries. The crucial piece here is that PowerShell supports here-strings (its take on heredocs), so we can paste the XML directly.
XML out, XML in
Here is the rest of the scheduled task XML and script. We create the scheduled task with Powershell using the XML content. After `vagrant reload`, we should have a
mapped drive. All this to support one batch script, but a useful pattern for the future.
Right — cleanup
There was more to it, I guess: we make sure to delete our temporary XML file.
Gimme your creds
In order to map the drive, we need credentials. These can be collected at Vagrant build time as a user prompt.
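Because a Vagrantfile is Ruby, the prompt can be ordinary Ruby. The sketch below is an assumption about how such a prompt might look, not the code from the slide; `ask_credentials`, the prompt wording, and the `DOMAIN\username` hint are all hypothetical.

```ruby
require 'stringio'

# Hypothetical prompt helper; in a Vagrantfile it would run while the file is evaluated.
# `input` and `output` are injectable so the logic can be exercised without a terminal.
def ask_credentials(input: $stdin, output: $stdout)
  output.print 'Network share user (e.g. DOMAIN\\username): '
  user = input.gets.to_s.strip
  output.print 'Password: '
  password = input.gets.to_s.chomp
  raise ArgumentError, 'user must not be empty' if user.empty?

  { user: user, password: password }
end

# Demo with canned input instead of a real keyboard.
creds = ask_credentials(input: StringIO.new("CORP\\cashier\nsecret\n"), output: StringIO.new)
puts creds[:user]
```

On a real terminal, `IO#noecho` from the `io/console` standard library can keep the password from being echoed while it is typed.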
My preciousss
Now that we can acquire the app, we have to figure out how to run the darn thing. Each app is different and unique in its own way. In our case, the app uses Java Swing,
multiple windows, and a separate launcher dialog.
Mystery of mysteries
• .jnlp
The app was distributed as a .jnlp file. Opened as a text file, we could retrieve the startup arguments and try to reproduce it ourselves. We take the args and pass them to
Java, hoping for the best (fingers crossed). Fortunately, it worked.
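A .jnlp file is XML, so the startup arguments can also be pulled out programmatically. The sketch below assumes the common descriptor layout (`resources` with `jar` and `property` entries, and an `application-desc` with `argument` children); the sample values and the helper name are placeholders, not the real app's.

```ruby
require 'rexml/document'

# Sample descriptor; every value here is a placeholder, not the real app's.
SAMPLE_JNLP = <<~XML
  <jnlp spec="1.0+" codebase="http://example.invalid/app">
    <resources>
      <jar href="launcher.jar"/>
      <property name="app.mode" value="store"/>
    </resources>
    <application-desc main-class="com.example.Launcher">
      <argument>--register</argument>
      <argument>1</argument>
    </application-desc>
  </jnlp>
XML

# Turn the descriptor into an argv array for a plain `java` launch.
def java_argv_from_jnlp(xml)
  doc = REXML::Document.new(xml)
  app = doc.elements['jnlp/application-desc']
  raise ArgumentError, 'no application-desc element' if app.nil?

  jars  = doc.get_elements('jnlp/resources/jar').map { |jar| jar.attributes['href'] }
  props = doc.get_elements('jnlp/resources/property').map do |prop|
    "-D#{prop.attributes['name']}=#{prop.attributes['value']}"
  end
  args = app.get_elements('argument').map(&:text)

  ['java', *props, '-cp', jars.join(File::PATH_SEPARATOR), app.attributes['main-class'], *args]
end

puts java_argv_from_jnlp(SAMPLE_JNLP).join(' ')
```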
Reliable drive mapping
• A sprinkling of indirection
To reliably map the network drive, on startup we’ll always remap it: remove the mapping, and then remap. Those magic arguments will be used forever. With the right
choices in the exported XML scheduled task, we’ll never have to worry about drive mapping again (at least for this VM). Oh, I forgot to mention — we are creating a
startup script here that will be run at startup, not actually mapping the drive at Vagrant runtime.
Enumerate for sanity
Here we map the drive. We also list out the files to verify we have indeed mapped the drive. Otherwise, we can’t get the app.
How would we know otherwise
Some minor refactoring to help the devs. We hint at the format of credentials.
Linux, briefly
• Kiosk mode
• Xvfb on Linux
All of these steps have been to support eventual automation, which means we need to run things headless. We try a brief foray into Linux with Xvfb, a virtual display
framebuffer. Without headless, the automation tests would always need to run on a user’s machine.
Copy-pasta
Here we set up the scheduled task to start up the app on VM startup, a la kiosk mode.
Launch app on connect
The rest of the script, where Windows launches the app when we remote in.
Step 7: kiosk mode
• 007-schedule-to-start-pos-on-login.ed
Start up application on remote login (kiosk
mode)
At this point, OS provisioning is complete. With the app launching on startup, we can test with Sikuli. Our first test is to take a screenshot, since that will prove Xvfb can
“see” something, and that Sikuli can run with our installed version of Java. JRuby is used here, I think because executing SikuliIDE “eats” a version of JRuby .jar that
resides next to it. In a future slide, we’ll install JRuby from scratch and use Sikuli solely from code.
Om nom nom
Note that we are again creating the script that will launch on startup, not actually executing the script. Remoting should launch the app and take a screenshot. Later, we
have the app launch on connect, not just at startup. The `jruby` used here is somehow available already. The relevant .rb files will now be covered.
It’s just zeroes and ones
• javax.imageio.ImageIO in JRuby
To take a screenshot, we use Java’s native imaging library and Sikuli’s capture method to save an image of the screen. We use an instance of Sikuli’s Screen and Region
objects to get our bounding box, capture that, and write the bytes to a file.
.ed language server
• Code is text
We also fix a bug in this commit, where we forgot to add a q (quit) command to .ed. The .rb script just invokes our Screenshoot module to send the capture message,
specifying a filename.
Tests as text
We are finally at JRuby, the text approach. We can diff tests as text, and organize a software architecture around a programming language.
JRuby!!
Here we install JRuby from the Internet, and specify JAVA_HOME environment variable.
whereis jruby
• Wherever we like
The script to install JRuby, continued. We set the PATH after we find the latest JRuby installation directory.
Vagrantfile as task runner
As before, .ed files to transform the Vagrantfile with a script to execute, and the JRuby setup and verification steps.
Punching through
This commit lets java.exe pass through Windows firewall. We also need to fix an issue where the app starts up only the second time and subsequent times, and never the
first.
netsh
• 010-open-java-exe-firewall.ed
Run scripts/open-java-exe-firewall.bat
Here is the script to allow java through the firewall, and the transform to specify it to run.
Promises, promises
We’re starting to develop a workflow. We’ll use SikuliIDE to manage images, the VM and JRuby to run tests, and xfreerdp to remote in and take screenshots of UI
components.
Tesseract (OCR) dependency
Sikuli uses OpenCV for image matching and Tesseract for OCR, and Tesseract needs the Visual C++ redistributable. So we install that.
Scheduled task boilerplate
Here is the scheduled task to run on login, which can launch the app, run tests, or anything needed.
on-login.bat
Continuing from previous, we copy to a common Startup folder. Subsequent runs will trigger from a scheduled task. We use both methods to get the desired kiosk
behavior, where the app always launches.
Arranging photos
SikuliIDE will render screenshots as images, but they are stored as text. Ruby lets us use strings as expressions, so we can have a kind of hierarchy where child UI
elements are beneath a parent string. The indentation is just a visual convention.
Stay awhile and sleep
As we are still in proof-of-concept phase, much of the code is procedural. We have very minimal boilerplate, however: just importing Java, specifying a path for .jars, the
SikuliX import statements, and setting verbose debugging. After that, we minimize the JRuby window and get to automation. First, we tell Sikuli about our global app
path. And then we sleep.
Macros and games
Now we can start clicking and typing things. From this mess, we can break out the code into modules, classes, and concerns. From there, we can start to incorporate
tests.
Screen recording (and stop)
We start and stop screen recording as a test, and then run config_pos.rb. This .bat script will run on login.
Tell Sikuli: images here
We start to break out environment and config code into its own module. We want this repo to potentially host screenshots of different apps to automate, but they all live
under a common bundle path.
Each step profits
Same deal here, but two transforms. One to install Visual C++ redistributable, and the other to schedule a startup task.
VLC for screen recording
We take a breather and try something different: screen recording. Now that we have automation in place, we can try to see if a video can be recorded.
Installing VLC
• Polling install
Here is the invocation to start and stop screen recording. The latter is done with `ncat`. A lot of options are passed to VLC.
VLC setup script
• 013-install-vlc.ed
Runs scripts/install-vlc.bat
From the .ed file and the .exe, it looks like we bundle VLC in the repo. That saves a download.
Failure to launch
VLC turned out to be too laggy for headless recording. And Xvfb renders Swing buttons in a strange way. So we need to look for alternatives. That’s future work. For this
commit, we fix the app not launching the first time.
Startup runbook
• Copying to Startup\
The bitsadmin changes supposedly make the JRuby download faster, but we replace it later with a much faster alternative. In the screenshot below, we copy a script to
the Startup\ folder.
Plain ol’ HTTP
• JDK 8
• PowerShell Invoke-WebRequest
It takes forever for bitsadmin to get its magic going, so we directly HTTP GET the thing. Otherwise, downloading Java and JRuby files can time out.
Downgrading Java 11 to 8
Here we change out Java 11 for Java 8, and also use PowerShell’s Invoke-WebRequest to fetch faster. The Java downgrade is due to the ruby-maven gem not yet supporting
the newer version. Sikuli still works, so no big deal.
Fast downloads
Same here: we simplify from a polling bitsadmin to just HTTP fetching JRuby directly. Evidently, from Java 8 to Java 11, the dash-version was updated to the more
conformant dash-dash-version flag syntax.
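The flag difference matters for the verification step: Java 8 only understands `-version`, while `--version` arrived with Java 9. A small sketch of how a script could pick the right flag from the version banner follows; the helper names are hypothetical and the build numbers in the sample banners are only illustrative.

```ruby
# Extract the major Java version from the banner printed by `java -version`.
# Java 8 and older report "1.8.0_x"; Java 9 and newer report "11.0.x" and so on.
def java_major_version(banner)
  match = banner.match(/version "(\d+)(?:\.(\d+))?/)
  raise ArgumentError, 'unrecognized version banner' if match.nil?

  first = match[1].to_i
  first == 1 ? match[2].to_i : first
end

# `--version` only exists from Java 9 onward; `-version` works everywhere.
def version_flag(major)
  major >= 9 ? '--version' : '-version'
end

# Sample banners (illustrative build numbers).
puts version_flag(java_major_version('openjdk version "1.8.0_252"'))
puts version_flag(java_major_version('openjdk version "11.0.7" 2020-04-14'))
```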
bundle install «gem»
Okay, now we can address gems. That means bundler, as well as our internal config tool and linter. Once this project graduates to have additional contributors, we’ll need
a consistent coding style and use existing conventions.
Wonderful, beautiful, shiny
Here’s what the Gemfile looks like. Of course, a script driven by an .ed transform gets the gems installed.
Install bundler, install gems
• 014-install-bundler.ed
Runs scripts/install-bundler.bat
Now that we have a Windows VM and a running app that we can interact with using JRuby and Sikuli, we can start writing tests. Minitest comes with Ruby, so we’ll use
that.
The “better diff” for Windows
We install diffutils, although for code diffs I used Visual Studio Code from the local machine before making commits.
Silent VLC, add minitest gem
We add the minitest gem in the Gemfile. Minor nitpick to use a no-space install directory for VLC, and a bit of research to tell VLC to use a new install directory.
GNU diff for Minitest
• 015-install-gnu-diff.ed
Runs scripts/install-gnu-diff.bat
The .ed script to set up the install. Next, we’ll refactor to modules and tests.
Start refactoring
With the proof-of-concept working, we start modularizing into components. The tests are UI-centered, and much like test automation in the web world with Selenium,
we’ll abstract interactions with Swing from user intent with the Page Object Pattern.
docs Explain running local tests
• Updating README
• Setting RUBYOPT
Just updating docs here for dev startup and setting a RUBYOPT environment variable for Minitest.
Reset dropdowns
Minor refactoring here to more reliably select defaults in the point-of-sale launcher dialog.
Loading in Sikuli
We split Sikuli API setup and a helper module. The boilerplate from the proof-of-concept, to initialize Sikuli, is moved into a sikuli_env.rb file, and the helper responds to
the minimize_window message.
Debug (logging), global config
In sikuli_test_environment.rb, we import useful Sikuli libs like Debug, set debug level, and introduce classes and methods to help with initializing the screenshot images
path. This is where we store our pictures of buttons, which in web tests like Selenium would define DOM elements through XPath.
A (UI) tree of wonder
• SikuliTestEnvironment::Component as
parent of UI components
Here we specify that the images representing POS UI elements like buttons, dialogs, etc. belong in a path off of the global image root path. The *.sikuli folder becomes a
subtype, a specialization, of a common parent class, which represents the parent folder. We also see the Components class, which will represent the generic parent of UI
components.
Strings as a source of truth
• (Helper methods)
• Automatic mapping:
components[key] = value
Some private methods here to help with the methods in the previous slide. And we see the PosComponents subclass near the bottom, a specialization of Components.
In Ruby, the ease of thinking so fluidly of object hierarchies with the `<` operator helps realize the idea of object specialization to provide specific behavior. More of
PosComponents class in the next slide.
Simple specialization
• Delegates to parent
With all the setup code from before, PosComponents just inherits from Components and its constructor just specifies PosApp. All the child needs to do is
delegate to the parent, once the type (PosApp) is known. It’s kind of hardcoding, but seems okay since we’re setting up a known environment.
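A minimal sketch of that parent and child relationship, assuming one .png per UI element inside a `PosApp.sikuli` folder. The class names come from the slides; the constructor arguments, the `[]` lookup, and the private `build_mapping` helper are assumptions made for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Generic parent: maps image names to image paths for one *.sikuli bundle.
class Components
  attr_reader :components

  def initialize(bundle_root, app_name)
    @bundle_path = File.join(bundle_root, "#{app_name}.sikuli")
    @components = build_mapping
  end

  # Look up an image path by name, failing loudly on unknown names.
  def [](key)
    components.fetch(key) { raise KeyError, "unknown UI component: #{key}" }
  end

  private

  # Automatic mapping: components[key] = value, keyed by filename without extension.
  def build_mapping
    Dir.glob(File.join(@bundle_path, '*.png')).sort.each_with_object({}) do |path, map|
      map[File.basename(path, '.png')] = path
    end
  end
end

# Specialization: the child only names the app and delegates to the parent.
class PosComponents < Components
  def initialize(bundle_root)
    super(bundle_root, 'PosApp')
  end
end

# Demo against a throwaway directory standing in for the image repository.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'PosApp.sikuli'))
  FileUtils.touch(File.join(root, 'PosApp.sikuli', 'button_login.png'))
  pos = PosComponents.new(root)
  puts pos.components.keys.inspect
end
```

Because the keys are derived from filenames, adding a screenshot to the folder is enough to make a new component name available to tests.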
Minitest!
• Learning Minitest
Introducing Minitest. We want to test our setup code (and learn Minitest). Despite these tests, a bug will be fixed later. But hey, we’re in a Windows VM from macOS,
running JRuby and Sikuli. The next bit is to connect that with testing the actual application.
• spec_helper.rb
We introduce a spec_helper.rb as well, which imports the needed libs. This assumes JRuby is invoked with a library path.
“Checkpoint commit”
• Toward login/logout
Checkpoint commits. Those are fun: you are grasping for the next set of features and end up roping in a bunch of changes you later need to summarize. It’s
nebulous in work like this, and I’m happy it just turned out to be a linear sequence of green lines (added lines). GitHub showing just addition after addition, rather than
switching between changes.
A batch of pictures
We’re adding to our image repository — that is, our UI elements repository — with more screenshots. We’re getting closer to login, and the data setup needed.
• SikuliComponent::Component initialized
with hash (image-filename-to-image-path
mapping) from
SikuliTestEnvironment::PosComponent
Here we have configuration starting to be stored in a .yml file. Derived values are determined at the top, like “use the specified default if not specified from environment.”
We also see a new module, SikuliComponent, which itself has a Component class, and this accepts the hash from SikuliTestEnvironment::PosComponent.
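The “default unless the environment says otherwise” rule can be sketched in a few lines. The .yml keys, the values, and the `setting` helper below are hypothetical, since the real config file is not shown in these notes.

```ruby
require 'yaml'

# Hypothetical config contents; the real keys are not shown in the notes.
CONFIG_YAML = <<~YAML
  defaults:
    bundle_path: C:/automation/images
    wait_seconds: 10
YAML

# Prefer the environment variable when set, otherwise fall back to the .yml default.
# `env` is any hash-like object, so ENV or a plain Hash both work.
def setting(config, env, key)
  env.fetch(key.upcase) { config.fetch('defaults').fetch(key) }
end

config = YAML.safe_load(CONFIG_YAML)
puts setting(config, ENV, 'bundle_path')
puts setting(config, { 'WAIT_SECONDS' => '30' }, 'wait_seconds')
```

Note that values taken from the environment arrive as strings, so numeric settings would need converting before use.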
• Components in test respond to intentful
service calls
We’re logging in now. Since we automated the mapping between image name and image path, the key is now the same as an XPath in Selenium, and we’re navigating a
quote-unquote “DOM tree,” except instead of a webpage, it’s the desktop, and we assert the image (the UI component) is present somewhere on it.
1 o’ the hardest problems in CS
Here’s the bug fix, where I was off by a directory level. It wasn’t to /app but the base_bundle_path itself. That seemed to create a redundancy in classes, but I might have
let it be.
Automating the login
Now we’re starting on the road to automating the app. The setup we saw in Minitest is realized in the `before do` block. This turns out to be costly, starting up the app for
each test. But it’s progress.
To /app or not to /app
Here we see the change to use /app after all. Maybe the bug fix previously was to generalize the library, as we are closer to runtime state, and the test would “know” more
about initialization than the library. We bulk up spec_helper.rb with our in-house config/.yml parser too.
Gherkin for knowledge management
We had a requirement early on not to use Cucumber. But the design of the classes is inspired by it and mirrors an organization toward natural language. We do not end up
needing the SikuliFeature module, as feature classes can be inherited from Minitest.
• “Intent-ful” interface calls
Minor refactoring here, so that named components do not have to initialize themselves but defer to parent. Now we’re just left with intent calls simply named, and
possibly various private helper methods.
• Primitive actions hidden behind an interface
Now that we can log in, we can start automation on a basic workflow in POS.
• Unused
Here’s the SikuliFeature module, which goes unused because we rely on Minitest’s inheritance for feature classes instead.
Here, we are just starting out on the boilerplate for the initial test.
Here is a test where we want to try clicking all the buttons and ensure they act as expected. The difference here is to log out from the main window instead of from a
dialog.
Updating our spec_helper.rb to include the (unused) feature module.
Interlude II
Interlude 2. There are some PRs that never got merged. Those are from trying out Dubray’s State-Action-Model (SAM) pattern, but I kept getting stack overflow in my state
machines. Getting a Ruby version of rocket launch working is a definite TODO. But here we just start adding tests. The framework is set up now for us to add lots of
tests.
How to exit a Java app
Some notes to self in the form of a README: how to exit from java.exe. Removing the screenshot function, since by this point the app launches reliably enough to debug
other things.
• Hide launching window
One more stab at VLC, but there are some quirks. The Xvfb shows the UI with exaggerated Swing buttons, so the screen caps we took of the components are not
recognized. We end up having to hack around that, and headless is tabled for the future.
If anything, by this point we’re pretty much writing page objects like in Selenium. Reused “constants” — what would be XPaths — are image names. In this example,
CHECKBOX_NO_DEVICES is used more than once, so we DRY it out and add it as a constant up top.
The PoC code, the messy stuff we had before, has been moved into components and methods.
Here we use runtime exceptions as pre- and post-conditions. Having methods fail fast is something I brought into my Selenium web testing. As those assertions live in
the methods as contracts, we find the specs don’t always have to check for them. In Cucumber, the Then steps can be shorter and reflect user capability instead of
specific UI elements. Borrowing from Node.js, we use an index.rb file to define the UI components that are themselves separate classes. Did y’all know a separate file
that specifies an existing module automatically gets included for namespace purposes? That’s nifty and keeps each file smaller.
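A small sketch of the contract idea, with a fake screen standing in for Sikuli so it runs anywhere. `FakeScreen`, `LoginDialog`, the image names, and the `expect_visible` helper are hypothetical; the point is only that the intent method raises when its pre- or post-condition does not hold.

```ruby
# Stand-in for a Sikuli screen: it only knows which images are currently visible.
class FakeScreen
  def initialize(visible)
    @visible = visible
  end

  def visible?(image)
    @visible.include?(image)
  end

  def click(image)
    # Hypothetical transition: clicking the login button reveals the main window.
    @visible = ['main_window'] if image == 'button_login'
  end
end

class LoginDialog
  def initialize(screen)
    @screen = screen
  end

  # Intent method guarded by a precondition and a postcondition.
  def log_in
    expect_visible('button_login', 'precondition')
    @screen.click('button_login')
    expect_visible('main_window', 'postcondition')
    true
  end

  private

  # Fail fast with a RuntimeError when the expected image is not on screen.
  def expect_visible(image, kind)
    raise "#{kind} failed: #{image} not visible" unless @screen.visible?(image)
  end
end

puts LoginDialog.new(FakeScreen.new(['button_login'])).log_in
```

With the checks inside the method, a spec that calls `log_in` gets the assertion for free and can focus on what the user is trying to do.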
• Caller never invokes primitives, only sends
messages of intent
We have a process called a “take,” and the UI elements reused are defined as constants. Each public method uses the primitives we have built up or borrowed from
Sikuli, including click, type, wait_for, and click_find. Private methods for tracking state, like line item count, are managed through messages like
increment_line_item_count. The Page Object is TakeOrder, the interface reflects intent (add, accept), and the consumer never directly clicks or types anything.
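The shape of such a page object can be sketched without Sikuli by injecting a driver that records primitives. `TakeOrder`, `add`, `accept`, and `increment_line_item_count` are named in the notes; `RecordingDriver`, the image names, and the method bodies are assumptions for illustration.

```ruby
# Records primitive actions instead of driving a real screen.
class RecordingDriver
  attr_reader :log

  def initialize
    @log = []
  end

  def click(image)
    @log << [:click, image]
  end

  def type(text)
    @log << [:type, text]
  end
end

# Page object: public methods express intent, primitives stay inside.
class TakeOrder
  # Hypothetical image names, reused across methods and so kept as constants.
  FIELD_SKU     = 'field_sku'.freeze
  BUTTON_ADD    = 'button_add'.freeze
  BUTTON_ACCEPT = 'button_accept'.freeze

  attr_reader :line_item_count

  def initialize(driver)
    @driver = driver
    @line_item_count = 0
  end

  # Intent: add one item to the order.
  def add(sku)
    @driver.click(FIELD_SKU)
    @driver.type(sku)
    @driver.click(BUTTON_ADD)
    increment_line_item_count
    self
  end

  # Intent: accept the order; refuses when nothing has been added.
  def accept
    raise 'cannot accept an empty order' if line_item_count.zero?

    @driver.click(BUTTON_ACCEPT)
    self
  end

  private

  def increment_line_item_count
    @line_item_count += 1
  end
end

driver = RecordingDriver.new
TakeOrder.new(driver).add('10012345').add('10067890').accept
puts driver.log.length
```

Returning `self` from each intent lets a test chain calls so that it reads close to a sentence.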
• More intents: apply_, asked_, delete_,
_cancel
Here are some more public methods from TakeOrder, in service to tests and expectations.
Continuing with the methods to use in tests, before we look at the private methods.
We have only a couple of private methods to track line item count state. Another component for the “top bar” UI, just for logging out. We just click the logout button,
which to Sikuli is the image of the logout button; once recognized, it sends a click.
The helper methods used in components, defined in sikuli_component.rb.
Here is a simpler component, at least to flesh out one of the dialogs. I remember this dialog has several more buttons.
Oh, the screenshot capability gets moved into a helper method.
Thanks to Ruby’s ability to execute shell commands, we can also encapsulate screen recording controls into the sikuli_helper.rb module. In sikuli_test_environment.rb, we
also add logging.
Some more bug fixes. We also see the “-I (eye) lib” flag, so our require statements don’t need to all be relative. In future work, it would be nice to provide a Rails
dashboard that could adjust on-login.bat to run one or more tests.
• Front-loading expensive setup
http://ruslanledesma.com/2016/07/21/
before-all-after-all-in-minitest.html
In Minitest, there is not a “before all” but the Internet had a solution: just call them before the “before do” block. We see the initial steps to start up the app are pretty
expensive, so we only want to log in once. Then every test per scenario can be done in an authenticated context.
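A minimal sketch of that trick in Minitest’s spec style: code placed directly in the `describe` body runs once, when the spec class is defined, while `before` runs for every test. `ExpensiveSetup` and `log_in_once` are hypothetical stand-ins for the real app launch and login.

```ruby
require 'minitest/autorun'

# Stand-in for the slow launcher/login sequence; counts how often it runs.
module ExpensiveSetup
  @runs = 0

  class << self
    attr_reader :runs

    def log_in_once
      @runs += 1
      :authenticated_session
    end
  end
end

describe 'take order' do
  # Code in the describe body runs once, when the spec class is defined.
  session = ExpensiveSetup.log_in_once

  # `before` still runs per test, so keep it cheap.
  before do
    @session = session
  end

  it 'starts authenticated' do
    _(@session).must_equal :authenticated_session
  end

  it 'shares one login across tests' do
    _(ExpensiveSetup.runs).must_equal 1
  end
end
```

The trade-off is that tests now share state from that single login, so each test should leave the app where the next one expects to find it.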
Refactoring placeholder code with actions.
More tests…
• English-like behavior calls
Now we’re starting to see the benefits of the interface: the tests nearly read like step defs.
And finally, adjustments to the spec_helper to include those new components. With that, we have successfully shown a Windows VM running our Java Swing app,
integrated JRuby with Sikuli, and used Minitest to verify behavior. Not a bad run with free and open-source software!
Thank you