Southern African Large Telescope

Prime Focus Imaging Spectrograph

Roadmap to the PFIS Control System Software

SALT-3140AE0026

Jeffrey W Percival

Modification Record
Version  Date         Comment
1.0      01-Jul-2004  initial release
1.1      15-Feb-2005  delete the init, enable, and disable UMI cmds
1.2      23-Aug-2005  add wp-move to list of diagrams
1.3      05-Oct-2005  remove reference to Configurator
  1. Introduction
    1. What is the PFIS Control System Model?
    2. Why Are There So Many VIs?
    3. How Do I Maintain Them?
    4. How Do I Add To Them?
  2. Statement of the Problem: Concerns and Constraints
    1. PFIS Mode Complexity
    2. LabVIEW Wiring Complexity
    3. Platform Dependence (Motion Control is Windows Only)
    4. PFIS Hardware Dependence
  3. Platform and Hardware Independence
    1. PXI API
    2. Simulation for Mac, Linux, Windows
    3. Flex VIs for Windows
  4. State Machines
    1. State Machine (Bubble) Diagram
    2. State Table
    3. Transitions
    4. Transition Server
    5. State Machine VI (State Server)
    6. Initialization & Recovery
  5. Global Variables: Low Level and High Level
  6. Example: Adding a New Mechanism
    1. Create Standard Typedefs
    2. Create Globals
    3. Create Transitions
    4. Mid-level - State Detection, Transition Server (operate)
    5. High-level - State Server
  7. PFIS State Diagrams: Low Level and High Level
  8. PFIS Error Codes

1 Introduction

We start out by asking a few simple questions, answering them vaguely, and then moving on into more detail.

1.1 What is the PFIS Control System Model?

In the PFIS Control System (PCS) model, each mechanism and each higher-level controller is implemented as a table-driven state machine. We describe this in detail below.

1.2 Why Are There So Many VIs?

We decided to produce a large number of simple VIs rather than a small number of complicated VIs. Each mechanism is implemented as a suite of many small, highly-specific VIs. Each mechanism replicates the same pattern of VIs. When you learn one mechanism, you learn them all.

Here’s an analogy. Skip this if you don’t like analogies. Imagine a table set for a fancy dinner, say, 8 guests (each one corresponding to a PFIS mechanism). Each guest commands a pretty large number of individual items: a chair, a dinner plate, salad plate, wine glass, water glass, saucer, coffee cup, soup spoon, tea spoon, salad fork, dinner fork, knife, butter knife, napkin.  For a party of 8, this involves 112 items. But these are not randomly chosen items. They are 8 instances of only 14 items. Once you learn what the soup spoon does, you know what it does for every guest. You learn it once, and apply it 8 times. The PFIS control system comprises hundreds of VIs, but you need to understand only about a tenth of them to make sense of the whole system.

1.3 How Do I Maintain Them?

The state diagrams below provide a roadmap of the PFIS control system.  To change a behavior, edit the transition VI (see below). To reorder events, update the state table.

1.4 How Do I Add to Them?

We provide an example below in which we add a new mechanism to the PFIS control system.

2 Statement of the Problem: Concerns and Constraints

We now present the concerns that drove us toward the chosen model of implementation.

2.1 PFIS Mode Complexity

PFIS is a complicated instrument. If you simply count up the choices as you proceed along the optical axis, you get to a combinatoric explosion pretty fast:

This gives 216 instrument states, not counting choices for 40 slitmasks, 6 gratings, 30 filters, and 130 articulation angles.

We mitigate this complexity by distinguishing mutually exclusive states (some choices disappear when you make other choices, e.g. grating vs. Fabry-Perot spectroscopy), but we are still left with the internal complexity of managing 16 pneumatic actuators with 20 relay-controlled valves and 8 motorized axes.

Strategy: Table-Driven State Machines (see below).

2.2 LabVIEW Wiring Complexity

LabVIEW has the virtue of being a well-designed graphical programming language. Data flow on wires, and modules (Virtual Instruments or VIs) operate on the data. A diagram looks like a circuit schematic. The problem is scalability. It is very easy to lay down so many VIs, interconnected by so many wires, that the diagram becomes almost literally impossible to understand.

How do we implement a complex control system without running into this LabVIEW complexity wall?

Strategy: use many small, highly-specific VIs rather than fewer, more complex ones.

2.3 Platform Dependence (Motion Control is Windows Only)

LabVIEW runs (in its most basic form) on Windows PCs, Linux PCs, and Macintoshes.  National Instruments supports our motion control hardware, however, only on Windows. We were concerned about availability of “seats” for programming.  One is first restricted to the set of machines that have LabVIEW licenses, and one is further restricted to programming motion control on the subset of licensed machines that happen to be Windows boxes.

Strategy: abstract away the hardware with a “thin” hardware compatibility API.

In other words: the vast majority of PFIS LabVIEW VIs are written in vanilla, platform-neutral LabVIEW code. This layer makes motion-control calls to a very specific, very small set of PFIS-defined "access points". On the Mac side, these access points consist of simulations; on the dark side (Windows), they represent actual calls to Motion Control VIs. So there are 3 parts to the PFIS Control System distribution: the platform-neutral code, the simulation base, and the dark base. You run the higher layer with either the simulation base or the dark base.

2.4 PFIS Hardware Dependence

Even if motion control were not restricted to Windows boxes, and were supported on, say, Macintoshes, we still did not want to be bound to the computer in the lab that was connected to the actual PFIS hardware interface cards. We did not want to require the presence of the hardware in order to advance the software, so that we could avoid conflicts in the lab during testing, write software at home, and so on. Imagine a developer needing the machine to fix bugs, but the instrument scientist needing the machine to run a calibration.

Strategy: add simulation to the non-Windows hardware compatibility layer.

3 Platform and Hardware Independence

Our goal is to produce a control system that uses LabVIEW in its most portable form. We need to design a system that can be coded on any LabVIEW-supported system (Mac, Linux, Windows) without requiring the PFIS hardware to be present. It needs to be deployed to the target Windows box that is connected to the PFIS electronics with a reasonable expectation that any improvements will work "out of the box".

3.1 PXI API

We create this environment by first identifying the smallest set of capabilities needed to operate the hardware. The actual interface that the software sees comprises one digital input board, one analog input board, and two motion control boards. The programmer normally uses National Instruments VIs to control these boards. We decided that these four boards are the right place for the "thin" hardware API. We designed four VIs, with the following access points:

These four VIs suffice to provide all the needed management and control of PFIS hardware. All PFIS VIs access the hardware through these four VIs. These four VIs represent the "eye of the needle" through which all control flows.

3.2 Simulation for Mac, Linux, Windows

When you don't have National Instruments' Flex Motion control software, or if the PFIS hardware is not connected to your computer, you arrange to have the simulation version of the four access-point VIs. These VIs do not use any of NI's proprietary VIs. Rather, there is a simple simulation. For example, asserting the digital output "articulation detente remove" should result in a change to the digital input (sensor) "articulation detente removed". Actions have consequences, and the simulation layer connects the actions to their consequences.

3.3 Flex VIs for Windows

In the real PFIS world, there are four VIs that share the same name as the sim VIs, but which are different internally. These "real" VIs accept the same inputs as the sim VIs, produce the same outputs, and have exactly the same connector layout. They differ from the sim VIs in that they use actual National Instruments Flex Motion VIs to interact with the boards in the backplane.
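
The sim/real swap described above can be sketched in a text language. This is a minimal Python sketch, not the PFIS code: the interface name DigitalIO, the class SimDigitalIO, and the methods write_output and read_input are all hypothetical. Only the two signal names come from the text. The point is that the platform-neutral caller (remove_detent here, also hypothetical) sees one interface, and the simulation connects an action to its consequence.

```python
from typing import Dict, Protocol


class DigitalIO(Protocol):
    """Hypothetical access-point interface shared by the sim and real versions."""

    def write_output(self, name: str, value: bool) -> None: ...
    def read_input(self, name: str) -> bool: ...


class SimDigitalIO:
    """Simulation version: no vendor calls, just actions wired to consequences."""

    # Asserting an output (key) changes the matching sensor input (value).
    CONSEQUENCES: Dict[str, str] = {
        "articulation detente remove": "articulation detente removed",
    }

    def __init__(self) -> None:
        self._inputs: Dict[str, bool] = {
            sensor: False for sensor in self.CONSEQUENCES.values()
        }

    def write_output(self, name: str, value: bool) -> None:
        sensor = self.CONSEQUENCES.get(name)
        if sensor is not None:
            self._inputs[sensor] = value

    def read_input(self, name: str) -> bool:
        return self._inputs[name]


def remove_detent(io: DigitalIO) -> bool:
    """Platform-neutral code: it only ever talks to the access point."""
    io.write_output("articulation detente remove", True)
    return io.read_input("articulation detente removed")
```

A real version would offer the same two methods and call the vendor's motion control software inside them; remove_detent would not change.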

4 State Machines

Now, having eliminated the hardware from consideration, we turn to controlling complexity in software. We chose to implement the PFIS Control System as a collection of table-driven state machines. This section describes what those are, and how we use them. We will use the slitmask as the example mechanism.

4.1 State Machine (Bubble) Diagram

The operation of the slitmask mechanism can be pictured in the form of a state diagram. This diagram shows actions, which we call transitions, and states, which describe the static mechanical configurations the mechanism can achieve.

4.2 State Table

The state diagram can form the basis of actual control software in the following way. Looking at the slitmask state diagram, imagine you are in State 1 (Home, with no slitmask). Imagine also that you want to be in State 5 (Inserted). What is the next action to take? The state diagram indicates that it is to perform the action "T1 - Select", which happens to move the elevator to a slitmask station. You are now in State 2. Look again at the diagram. To get from State 2 to State 5, perform the action "T2 - Fetch", which uses a pneumatic actuator to draw the slitmask into the elevator. You are now in State 3. You repeat as needed, each time reading another action from the state diagram, and you stop when you get where you want to be, State 5.

How does this help us programmatically? Imagine a state table, shown on the state diagram as a table and here as a LabVIEW Type Definition, whose rows are indexed by what state you are in, and whose columns are indexed by what your final destination state is. Each cell in the table contains two pieces of information: what action to take next, and what state that takes you to. Summarizing the paragraph above, you do this:

Note that reaching your desired state means ending up on the diagonal of the state table. We handle diagonal elements in two ways. In most cases, when you are where you want to be, you are happy to do nothing. We use "T0" to represent the do-nothing action. We actually have a VI that does nothing but beep. (S5,S5) indexes the action T0 (do nothing), resulting in the state S5 (no change).

Look, however, at (S3,S3). This indexes the "T1 - Select" action. This makes sense here. Imagine you've gone to slot 20 to fetch a slitmask, and then you change your mind and want to go to slot 32. The slot number is not part of the state, so you can't assume a do-nothing.

We choose how to handle each diagonal element on a case-by-case basis, according to what makes sense for the mechanism.
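
The table lookup described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the slitmask table: the three-state mechanism, its transition numbers T1 through T4, and the function go_to are made up. Only the cell layout (next action, resulting state) and the do-nothing action T0 come from the text.

```python
from typing import Callable, Dict, Tuple

# Hypothetical three-state mechanism (NOT the real slitmask table).
# STATE_TABLE[current][desired] = (action to take next, state it leads to)
STATE_TABLE: Dict[str, Dict[str, Tuple[str, str]]] = {
    "S1": {"S1": ("T0", "S1"), "S2": ("T1", "S2"), "S3": ("T1", "S2")},
    "S2": {"S1": ("T3", "S1"), "S2": ("T0", "S2"), "S3": ("T2", "S3")},
    "S3": {"S1": ("T4", "S2"), "S2": ("T4", "S2"), "S3": ("T0", "S3")},
}


def go_to(current: str, desired: str, execute: Callable[[str], None],
          max_steps: int = 20) -> str:
    """Read one action at a time from the table until T0 (do nothing) comes up."""
    for _ in range(max_steps):
        action, next_state = STATE_TABLE[current][desired]
        execute(action)
        current = next_state
        if action == "T0":
            return current
    # Guard against a malformed table that never reaches a T0 cell.
    raise RuntimeError("state table did not converge")


if __name__ == "__main__":
    log = []
    final = go_to("S1", "S3", log.append)
    print(final, log)
```

Note that the loop contains no knowledge of the mechanism. Reordering events means editing STATE_TABLE, not the loop.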

4.3 Transitions

After building the state table, we must create a self-contained VI for each of the transitions. For example, slitmask-t2.vi is responsible for executing the T2-Fetch transition. We chose to break up each transition into three steps:

  1. slitmask-t2-ok.vi: check to see if it's OK to do it (e.g. check interlocks)
  2. slitmask-t2.vi: do it
  3. slitmask-t2-wait.vi: wait until it's done

Note the simple linear wiring on the slitmask-t2.vi diagram. We use LabVIEW's error handling techniques to wave off actions if errors occur.

Note: some transitions do not have interlocks. Tough. Make the "OK" VI anyway, and make it a no-op. Pass the input error to the output. This is part of the discipline. Every transition has this triplet of VIs. Not only does this make for uniformity of appearance, it also has a much more important function: it gives you a place to put an interlock (or other check) if you decide you need one.

Example: suppose you have no interlock on opening the shutter. The shutter-t1-ok.vi is empty (a no-op). It's there in the hierarchy, it's being called, it just does nothing. Now suppose you decide you'd like to veto the shutter opening if, say, the pneumatic air pressure is too low. No hunting around to find a place to do this; no breaking of any wires. You go to shutter-t1-ok.vi, and add the condition there. Generate an error if the pressure is too low.

Everything will still work.

No side effects, no risk, no unintended consequences.
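
Here is a minimal Python sketch of the ok/do/wait triplet and of the shutter example, under stated assumptions. The error is modeled as None or a message string, standing in for the LabVIEW error cluster; run_transition, the sensors dictionary, and the air_pressure_low flag are hypothetical names invented for the illustration.

```python
from typing import Callable, Dict, List, Optional

Error = Optional[str]  # None means "no error"


def run_transition(ok: Callable[[], Error], do: Callable[[], Error],
                   wait: Callable[[], Error], error: Error = None) -> Error:
    """Run the three steps in order; once an error exists, skip the rest."""
    for step in (ok, do, wait):
        if error is not None:
            break  # wave off the remaining steps and pass the error along
        error = step()
    return error


# Hypothetical shutter example.
sensors: Dict[str, bool] = {"air_pressure_low": False}
actions: List[str] = []


def shutter_t1_ok_noop() -> Error:
    return None  # no interlock yet: the VI exists but does nothing


def shutter_t1_ok_with_pressure_check() -> Error:
    # The interlock added later, in the one place reserved for it.
    return "interlock: air pressure too low" if sensors["air_pressure_low"] else None


def shutter_t1() -> Error:
    actions.append("open shutter")
    return None


def shutter_t1_wait() -> Error:
    actions.append("shutter open")
    return None
```

Swapping shutter_t1_ok_noop for shutter_t1_ok_with_pressure_check changes nothing in run_transition or in the other two steps.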

4.4 Transition Server

Once we have read the required transition out of the state table, we need to be able to execute it. The transitions are "served" out by a transition server, which simply offers you a pull-down menu of transitions. Note that no logical sequencing is performed; if you ask for a transition that isn't allowed in the current state, the -ok.vi should catch it by examining interlocks.

4.5 State Machine VI (State Server)

Finally, all these pieces are integrated into a top-level VI, the state machine (or state server, in analogy with transition server). It keeps track of the current state; you select the desired state, and it does whatever it takes to get you there.

Important point: this table-driven approach allows you to get from any allowed state to any other allowed state, and the state machine VI gets you there with the minimum number of transitions. The states are sequenced entirely by the contents of the state table, instead of being inflexibly wired in some pre-chosen order by LabVIEW wires.

4.6 Initialization & Recovery

The state machine approach simplifies initialization and error recovery. The state machine keeps the current state variable in an uninitialized shift register on a while loop (translation: it remembers its state variable between one run of the machine and the next run; but the state variable shows up as "uninitialized" on the very first run). When the uninitialized condition is detected, the state machine goes out and examines the hardware sensors to try to deduce what state the mechanism is in. It is often successful, and because the table-driven state machine can easily move between any two states, it can seamlessly pick up from wherever it is.

Put another way: if the system crashes and leaves the slitmask in state S3, then after a reboot the state machine will detect that, and correctly move you to your desired state (say, S5). It won't be confused by the fact that it didn't wake up in state S1.

All this is done without extensive wiring or densely nested case statements. It's all driven by the state table.
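
The "uninitialized on the first run" behavior might look like this in Python. This is a sketch under stated assumptions: the class StateKeeper, the function detect_state, and the sensor names slitmask_inserted and elevator_home are hypothetical. The sketch uses only S1 (Home) and S5 (Inserted) because those are the states whose meaning is given in the text.

```python
from typing import Callable, Dict, Optional

Sensors = Dict[str, bool]


def detect_state(sensors: Sensors) -> str:
    """Deduce the mechanism state from sensor readings (hypothetical rules)."""
    if sensors.get("slitmask_inserted"):
        return "S5"  # Inserted
    if sensors.get("elevator_home"):
        return "S1"  # Home
    raise LookupError("cannot deduce mechanism state from sensors")


class StateKeeper:
    """Remembers the state between runs; starts out uninitialized."""

    def __init__(self, read_sensors: Callable[[], Sensors]) -> None:
        self._read_sensors = read_sensors
        self._state: Optional[str] = None  # None plays the role of "uninitialized"

    def current(self) -> str:
        if self._state is None:
            # First run (or first run after a crash): ask the hardware.
            self._state = detect_state(self._read_sensors())
        return self._state

    def update(self, new_state: str) -> None:
        self._state = new_state
```

Once current() has returned a state, the table-driven loop can take over from there, whichever state that happens to be.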

5 Global Variables: Low Level and High Level

In addition to spaghetti wiring, Global Variables in LabVIEW can be another source of obfuscation and complexity. We use global variables sparingly, and with discipline. We use Global Variables to hold the sensor readings that come out of the hardware: limit switches, end-of-travel sensors, encoder voltages, temperature readings, etc.

We view these inputs on two levels, which I have variously called by several pairs of names. I can't decide, but let's use unnamed/named here.

Take the digital inputs as an example. The National Instruments PXI-6508 is a device that accepts 96 digital (on or off) inputs via 12 ports of 8 inputs, or bits, each. Reading a port out of this device can tell you that port 7, bit 3 is on. This is an example of an unnamed input. Its name "port 7 bit 3" is almost like no name at all, if you don't have the decoder sheet with you.

Suppose port 7 bit 3 is connected to the Slitmask Inserted sensor. "Slitmask Inserted" is an example of a named input. Much more useful. We ingest the 6508 inputs in two steps:

  1. Read the port and bit data into a global variable (pxi-6508-global.vi). This is a simple 12x8 array of unnamed boolean indicators.
  2. Copy bits out of the 6508 Global Variable into other, mechanism-specific Globals with named indicators (e.g. slitmask-global.vi). Each mechanism has a "populate" VI that does this transfer from unnamed to named storage. See slitmask-populate.vi.
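
The two-step ingest above can be sketched in Python. The wiring entry follows the supposition in the text (port 7, bit 3 is Slitmask Inserted); the names new_raw_inputs, SLITMASK_WIRING, and populate are hypothetical, and the sketch assumes ports and bits are counted from zero.

```python
from typing import Dict, List, Tuple

N_PORTS, N_BITS = 12, 8  # 96 digital inputs, as 12 ports of 8 bits


def new_raw_inputs() -> List[List[bool]]:
    """Unnamed storage: a plain 12x8 array of booleans."""
    return [[False] * N_BITS for _ in range(N_PORTS)]


# Named input -> (port, bit). This plays the role of the decoder sheet.
SLITMASK_WIRING: Dict[str, Tuple[int, int]] = {
    "Slitmask Inserted": (7, 3),
}


def populate(raw: List[List[bool]],
             wiring: Dict[str, Tuple[int, int]]) -> Dict[str, bool]:
    """Copy bits from unnamed storage into named, mechanism-specific storage."""
    return {name: raw[port][bit] for name, (port, bit) in wiring.items()}
```

If a sensor is rewired to a different port or bit, only the wiring table changes; everything that reads "Slitmask Inserted" is untouched.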

Analog Inputs

There are a few analog inputs associated strictly with mechanism control. We use three string-drawn potentiometers to sense the rough position of the slitmask, grating, and filter magazines. We leave these to their mechanism-specific modules to handle.

Most of the analog inputs are more general: power system voltages, temperatures, air flows, humidities, and so on. We collect these into unnamed storage in pxi-6071-global.vi, and then into a Global of named indicators called enviro-global.vi. The transfer from unnamed storage to named storage is done by enviro-populate.vi, which also handles converting raw analog voltages into physical units (degrees C, liters per minute, etc.).
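
As a rough Python sketch of the analog case: copy each raw voltage into a named slot and convert it on the way. Everything specific here is made up for illustration, including the channel numbers, the indicator names, the linear scale and offset values, and the assumption that a linear conversion is adequate.

```python
from typing import Dict, List, Tuple

# Named input -> (channel, scale, offset, unit). Every entry is hypothetical.
ENVIRO_CHANNELS: Dict[str, Tuple[int, float, float, str]] = {
    "Enclosure Temperature": (0, 10.0, -5.0, "degrees C"),
    "Air Flow": (1, 2.0, 0.0, "liters per minute"),
}


def enviro_populate(raw_volts: List[float]) -> Dict[str, float]:
    """Transfer unnamed voltages to named values in physical units."""
    return {
        name: raw_volts[channel] * scale + offset
        for name, (channel, scale, offset, _unit) in ENVIRO_CHANNELS.items()
    }
```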

6 Example - Adding A New Mechanism

Now we will step through a new mechanism with examples. Let's use the slitmask mechanism as the example, so we can look at the VIs as we mention them.

We create three types of LabVIEW modules:

  1. Strict Typedefs. These are like Structs in C, if you know C. You define a set of variables in one place, and use them over and over again in your application.
  2. Global Variables. We use these carefully, and with discipline.
  3. VIs. These contain the executable code.

6.1 Create Standard Typedefs

We make all these modules "Strict" Typedefs.

6.2 Create Globals

Now we create our mechanism-specific global variables.

Let's say a few words about mechanism constants. To move the slitmask magazine from one station to the next, we need to know things like this:

We have taken special pains to collect these kinds of data into one well-known place (slitmask-constants.vi). We then calculate other things (e.g. steps per station) from these basic constants. We never embed a value such as a gear ratio into some random VI that happens to need it. If you replace the drive screw with one of a different pitch, changing the value here in the slitmask-constants.vi global will cause everything to behave correctly with no further changes anywhere.
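
To make the idea concrete, here is a Python sketch of one place for basic constants with derived values computed from them. The field names and all the numbers are hypothetical, not the real slitmask values, and the sketch assumes a simple screw drive in which steps per station is the product shown.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SlitmaskConstants:
    """Basic constants live here and nowhere else (values are made up)."""
    motor_steps_per_rev: int = 200
    gear_ratio: float = 10.0          # motor revolutions per screw revolution
    screw_pitch_mm: float = 2.0       # travel per screw revolution
    station_spacing_mm: float = 8.0   # distance between adjacent stations

    @property
    def steps_per_station(self) -> float:
        """Derived value: never stored, always computed from the basics."""
        screw_revs = self.station_spacing_mm / self.screw_pitch_mm
        return screw_revs * self.gear_ratio * self.motor_steps_per_rev


if __name__ == "__main__":
    constants = SlitmaskConstants()
    print(constants.steps_per_station)
    # Replace the drive screw with one of a different pitch:
    print(replace(constants, screw_pitch_mm=4.0).steps_per_station)
```

Changing the pitch in one place changes every derived quantity, which is the behavior described above for slitmask-constants.vi.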

6.3 Create Transitions

For each transition, build three VIs.

6.4 Mid-level - State Detection, Transition Server (operate)

Build the slitmask-state-detect.vi and the slitmask-state-table-index.vi modules. These are used by the state machine. Construct the transition server slitmask-operate.vi. This just executes a transition based on the selected value in the enumeration.

6.5 High-level - State Server

Construct the state server (the state machine) slitmask.vi. The state servers look just the same across all the mechanisms, owing to the uniformity of design in the lower level modules.

7 PFIS State Diagrams

Low-Level

A word about motorized axes. The PFIS complicates axis movement in a variety of ways, with brakes, detents, and various requirements on turning the motors on and off. Some axes are easy, like the slitmask, where the motor can just be switched off any time. The articulation motor, on the other hand, requires a specific sequence of interlocked events before the motor can be turned off. The complexity of managing the articulation motor actually rises to the level of needing its own state machine, with all the ancillary VIs required by our strict and uniform approach.

In fact, because some axes require state machines, we decided to require all axes to have them. That way, should any axes have a change in requirements, the structure is already in place to handle it.

The axis-management state machines are numbered not by mechanism number, but by axis number (this is because of namespace collisions in LabVIEW's error number management).

Mid-Level

High-level

8 PFIS Error Codes

LabVIEW maintains a data base of error codes and text messages. When a VI generates an error, the code is used to look up the message, which is displayed to the user. Users are allowed to augment this data base with user-defined error codes. LabVIEW reserves the range 5000-9999 for user-defined codes.

The problem is how to assign PFIS error conditions to this range of codes, while maintaining some sort of logic to the numbering. Our requirement is to be able to distinguish types of errors (interlocks vs. timeouts) and to distinguish subsystems (slitmask vs. waveplate), and to do this in a numeric range of only 5000 codes.

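We chose a scheme of the form cssn, where c identifies the type of error, ss identifies the subsystem, and n is a final digit whose meaning depends on c.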

For the 6ssn and 7ssn codes, n refers to the transition number in the state diagram in which the error occurred. No PFIS state machine has more than a single-digit number of transitions.

For example, the code 6052 means that you tried to Fetch a slitmask, but you were interlocked out of doing it. The 6 means interlock, the 05 is slitmask, and the 2 means T2, or Fetch.
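
The arithmetic of the cssn scheme is easy to check in Python. This sketch only splits and joins the digits; the names PfisErrorCode, decode, and encode are hypothetical, and the sketch attaches no meaning to any value of c, ss, or n beyond the 6052 example given above.

```python
from typing import NamedTuple


class PfisErrorCode(NamedTuple):
    c: int    # type of error (6 is interlock in the example)
    ss: int   # subsystem (05 is slitmask in the example)
    n: int    # transition number for the 6ssn and 7ssn codes


def decode(code: int) -> PfisErrorCode:
    """Split a four-digit user-defined code into its c, ss, and n fields."""
    if not 5000 <= code <= 9999:
        raise ValueError("not in the user-defined range 5000-9999")
    c, rest = divmod(code, 1000)
    ss, n = divmod(rest, 10)
    return PfisErrorCode(c, ss, n)


def encode(c: int, ss: int, n: int) -> int:
    """Join c, ss, and n back into a single code."""
    if not (5 <= c <= 9 and 0 <= ss <= 99 and 0 <= n <= 9):
        raise ValueError("field out of range")
    return c * 1000 + ss * 10 + n


if __name__ == "__main__":
    print(decode(6052))  # PfisErrorCode(c=6, ss=5, n=2)
```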

We created a FileMaker Pro data base of these codes, and export them in a form acceptable to LabVIEW.

We have reserved values for the infrared beam. IR beam mechanisms can use 14-19, and its motorized axes can be numbered in the range 51-54.