Speed is where it’s at. No client is going to pay you to make his app run slower. Fox Software’s initial claim to fame was that FoxBASE (and later, FoxBASE+ and FoxPro) ran faster than its competitors. Fox’s reputation for speed was well deserved. However, the speed gain has never been automatic. You have to do things right to make your code “run like the Fox.”
The Rushmore optimization technology introduced in FoxPro 2.0 is based on indexes. Rushmore examines index files to determine which records meet the conditions for a particular command. So, in order to make things fast, it's important to create the right indexes and to write code that takes advantage of those indexes. The difference between the lightning-fast code your client uses to make crucial strategic decisions and the plodding code his competition uses might be no more than an operator or two, so listen up!
There are also some other tricks, not related to Rushmore, that can speed up your applications considerably. This section discusses both Rushmore and non-Rushmore optimization techniques.
Fox Software always claimed that Rushmore got its name because Dr. Fulton and the development team had been watching Hitchcock's North by Northwest, whose climax takes place on Mount Rushmore. But we have no doubt the name caught on due to the word "rush" embedded in it. In fact, some of the FoxPro documentation and advertising used the phrase "RUSH me some MORE records."
As we mentioned above, the key to getting the most from Rushmore is to create the right indexes and take advantage of them. So how do you know which are the right indexes, and how do you take advantage of them?
Rushmore can optimize the SET FILTER command, any command involving a FOR clause, and SELECT-SQL. The secret (not really a secret; it's documented) is to make the left-hand side of each expression in the filter, FOR or WHERE clause exactly match an existing index tag. For example, to optimize:
SUM OrderTotal FOR state="PA"
you need an index tag for state. If your tag is on UPPER(state) instead, you'd want to write the command as:
SUM OrderTotal FOR UPPER(state)="PA"
Suppose you want to find everyone named Miller in a table of Clients, and that you have a tag on UPPER(cLastName+cFirstName) to put folks in alphabetical order. You optimize the BROWSE by writing it as:
BROWSE FOR UPPER(cLastName+cFirstName)="MILLER"
even though you’re really interested only in the last name.
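If you want to see whether Rushmore is actually kicking in for a query, the SYS(3054) SQL ShowPlan function will tell you. Here's a minimal sketch; the table and tag names are ours:

USE Clients
INDEX ON UPPER(cLastName+cFirstName) TAG NameOrd  && the tag the condition must match
SYS(3054, 11)  && turn on ShowPlan: reports filter and join optimization
SELECT * FROM Clients WHERE UPPER(cLastName+cFirstName) = "MILLER"
SYS(3054, 0)   && turn ShowPlan back off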
We’ve answered the second question—how to take advantage of existing tags—but we still haven’t tackled the first: What are the right indexes to create? That’s because it’s not always straightforward. There are a few clear-cut rules, but to a great extent, you’ll need to use your judgment and test your theories against your data, on your hardware. Here are the rules:
Create a tag for your primary key, the field that uniquely identifies the record. (Do this whether or not you define it as a primary key in the database.) You’ll need it to look up particular records and for setting relations. (If your table is in a database, you’ll want a tag for the primary key anyway for creating persistent relations.)
Create a tag for any field or expression you expect to search on frequently.
Create a tag for any field or expression you think you’ll want to filter data on frequently. (This is to let Rushmore kick in.)
Make sure the tags you create exactly match the conditions you’ll need to search or filter on.
Don’t automatically create tags on every field. (That’s called inverting the table.) It can make adding and updating records slower than necessary, especially if you have a lot of fields in your table. On the flip side, if you have a table, especially one that is rarely changed, where you do use every field in filters, then go ahead and invert the table.
Don't create indexes with a NOT expression for Rushmore optimization. Rushmore ignores any tag whose expression contains NOT. If you need the NOT expression, say, for a filter, create both indexes, one with and one without the NOT.
Don't filter your tags. That is, don't use the FOR clause of the INDEX command. Rushmore ignores tags that are filtered. If you need a filtered tag for some reason, and you're likely to filter on that tag's index expression as well, create an unfiltered tag, too, as in the sketch after this list.
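Here's what that last rule looks like in practice; the table, field and tag names are hypothetical:

USE Orders
INDEX ON dOrderDate TAG OrdDate                && unfiltered: Rushmore can use this one
INDEX ON dOrderDate TAG OpenDate FOR !lFilled  && filtered: handy elsewhere, but Rushmore ignores it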
In general, you'll be trading off update speed for search speed. So, think about what you expect to do with this table. If it's going to have lots of additions but few searches, keep the number of tags to a minimum. If it'll be used for lots of searching, but rarely updated, create more tags. You have other middle-of-the-road options, too: If you do an intensive filtering process once a month, but primarily do updates the rest of the time, consider creating temporary tags (in their own CDX, or as stand-alone compact indexes) for the process. On the flip side, if you primarily handle queries against the data, but add large batches of the data infrequently, consider dropping all of the tags for the bulk load (after saving their definitions, of course) and then re-creating them afterwards.
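The temporary-tag approach might look something like this sketch (the file and field names are ours):

* Once a month, before the big filtering job:
USE Sales
INDEX ON cRegion TO SalesTmp COMPACT  && stand-alone compact IDX; the structural CDX is untouched
* ... run the monthly process ...
SET INDEX TO                          && closes the IDX, leaves the structural CDX open
DELETE FILE SalesTmp.idx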
For many years (and in the earlier editions of this book), one of the mantras of optimization advice was to index every table on the DELETED() function if an application was to run with SET DELETED ON. This advice was accepted by virtually everyone who knew anything about Rushmore. Here's the way we explained it in the VFP 6 edition of this book:
"Even in many complex queries and FOR clauses, Rushmore performs its magic almost entirely on the relatively small and compact CDX file, a file structured with nodes, branches and leaves to be searched efficiently. When DELETED is ON, FoxPro has to check each and every record in a result set (whether from a query, a filter, or FOR) to see if it's deleted, even if no records are actually deleted. This sequential reading of the entire cursor or file completely defeats the benefits of Rushmore. Don't do it!

"By creating a tag on DELETED(), you let Rushmore do the checking instead of looking at each record sequentially, which makes the whole thing much faster. The larger the result set, the more speed-up you'll see."
Sounds really good, doesn’t it?
The walls came crashing down on this piece of wisdom with an article by Chris Probst in the May '99 issue of FoxPro Advisor. Probst was working with an extremely large data set in a network environment and found that some of the queries were just too slow. Monitoring network traffic revealed the problem: The portion of the index file related to the DELETED() tag was huge (at least in some cases), and transferring it across the network was bogging down the whole query.
Probst's experiments determined that, when an expression has only a few discrete values (like .T. and .F.), and the values are unevenly distributed (as is typically the case with DELETED(), since in general, few records are deleted), queries run faster without a tag on the expression. In those cases, it's better to let VFP narrow things down first based on the other expressions involved, and then do a sequential check of the remaining records.
The phenomenon Chris witnessed is not unheard of in the database world. An index with only a few distinct values is said to have "low selectivity" and is considered suspect in most database designs. But this was the first documented case of the low-selectivity issue appearing in FoxPro's remarkable optimization.
What does all this boil down to for you? For small tables in a desktop or LAN situation, we generally think having a tag on DELETED() is a better choice. As tables get larger, or when operating in a WAN situation, skip it. And just for good measure, you should test the performance, in your specific production (as opposed to development) environment, with and without the DELETED() tag.
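The tag itself is a one-liner; it's the decision of whether to create it that takes the testing:

INDEX ON DELETED() TAG IsDel  && often a win for small tables on a desktop or LAN; skip it on a WAN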
Another common problem goes like this. In troubleshooting sessions we attend, someone complains that a filter should be optimized, but it’s dog slow. He’s asked to show the filter and the tags. Everything looks good for Rushmore to optimize the filter. Puzzling.
Then he shows the code he’s using. Typically, it looks something like this:
SET FILTER TO <something optimizable>
GO TOP && put filter in effect
and the light goes on. GO TOP and GO BOTTOM are not optimizable commands. They move through the records sequentially, attempting to find the first record matching the filter.
Without a filter (and with SET DELETED OFF), this isn't generally a problem. Moving to the top or bottom of the current order is pretty quick. FoxPro can either locate the first or last record in the index or, if no tag is set, move directly to the beginning or end of the file.
But when a filter is set (or DELETED is ON, which is like having a filter set), once GO gets to the first or last record in the order, it has to search sequentially for the first record that matches the filter condition. This is what's so slow. Smart like a fox, eh? What a dumb idea! It's like writing code to go to record 10 by issuing a SKIP, asking whether RECNO()=10, and if not, SKIPping again.
What can you do about it? Don't use GO TOP and GO BOTTOM. How do you avoid them? By using a neat trick. It turns out that LOCATE with no FOR clause goes to the first record in the current order. So, for GO TOP, you just issue LOCATE, like this:
SET FILTER TO <optimizable condition>
LOCATE && Same as GO TOP
Okay, that works for finding the first record. What about the last record? You have to stand on your head for this. Well, almost. You really have to stand the table on its head. Try it like this:
SET FILTER TO <optimizable condition>
* Reverse index order
lDescending = DESCENDING()
IF lDescending
   SET ORDER TO ORDER() ASCENDING
ELSE
   SET ORDER TO ORDER() DESCENDING
ENDIF
* Now Top is Bottom and Bottom is Top
LOCATE && Top of the reversed order, which is really the bottom
IF lDescending
   SET ORDER TO ORDER() DESCENDING
ELSE
   SET ORDER TO ORDER() ASCENDING
ENDIF
After setting the filter (or with a filter already in effect), you turn the index upside down. If it was ascending, you make it descending; if it was descending, you make it ascending. Then, use LOCATE to go to the first record. Since you've reversed the order, that's the last record in the order you want. Then, reverse the order again. Voila! You're on the bottom record.
By the way, the code above works only if there is an index order set. If there might be no order, you have to check for that.
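A sketch of that check might look like this:

IF EMPTY(ORDER())
   * No index order is set, so GO BOTTOM is already quick.
   GO BOTTOM
ELSE
   * Use the reverse-the-order LOCATE trick shown above.
ENDIF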
One more warning. Under particular circumstances, the work-around can be very slightly slower than just using GO. In most cases, though, it tends to be an order of magnitude faster. We think it's worth it.
We should also comment that, in VFP 7, we've seen some cases where GO TOP and GO BOTTOM seem to behave in a more optimized way. But enough of our tests still show the LOCATE technique to be faster for us to stick with it.
SQL-SELECT has two clauses that filter data: WHERE and HAVING. A good grasp of the English language might lead us to believe that these are synonyms, but SQL is not English, and mixing these two indiscriminately is a sure-fire disaster in the making! It's not obvious at first glance where a particular condition should go. But getting it wrong can lead to a significant slowdown.
Here's why. The conditions in WHERE filter the original data. Wherever possible, existing index tags are used to speed things up. This produces an intermediate set of results. HAVING operates on the intermediate results, with no tags in sight. So, by definition, HAVING is slower than WHERE, if a query is otherwise constructed to be optimized.
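To make it concrete, here's a sketch of the same row filter written both ways, assuming a tag on cState:

* Both return the same rows, but only the first can use the tag:
SELECT * FROM Customer WHERE cState = "PA"   && Rushmore-optimizable
SELECT * FROM Customer HAVING cState = "PA"  && filters the intermediate results: slow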
So, when should you use HAVING? When you group data with GROUP BY and want to filter not on data from the original tables, but on "aggregate data" formed as the result of the grouping. For example, if you group customers by state, counting the number in each, and you're interested only in states with three or more customers, you'd put the condition COUNT(*)>=3 in the HAVING clause.
SELECT cState,COUNT(*) ;
FROM Customer ;
GROUP BY cState ;
HAVING COUNT(*)>=3
A simple rule of thumb: Don't use HAVING unless you also have a GROUP BY. That doesn't cover all the cases, but it eliminates many mistakes. To make the rule complete, remember that a condition in HAVING should contain one of the aggregate functions (COUNT, SUM, AVG, MAX or MIN) or a field that was named with AS and uses an aggregate function.
Unfortunately, simple rules aren't always the best rules. There's one other situation where you may choose to use HAVING rather than WHERE: when you're specifically trying to avoid having Rushmore come into play. When would that be? When you have an index on a field (presumably because you need it elsewhere), but there are only a few distinct values for the field. If the query has other optimizable fields that narrow the result set down to a small percentage of records, the cost of reading the relevant part of this index may be greater than the cost of sequentially checking the records isolated by the other conditions. In that case, moving the condition to the HAVING clause means that Rushmore ignores it, and the index isn't read.
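So, to deliberately keep Rushmore away from a low-selectivity index, you might write something like this sketch (the field names are ours; note that VFP allows HAVING without GROUP BY, though it's usually a mistake):

SELECT * FROM Orders ;
   WHERE cCustId = "ALFKI" ;  && selective and optimized: isolates a handful of records
   HAVING lPaid = .F.         && low-selectivity condition: checked sequentially afterward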
FoxPro lets you store procedures and functions in a variety of places. But using the Project Manager gives you a strong incentive to put each routine in a separate PRG file. We generally agree with this choice.
But, if you're not careful, in versions through the original release of VFP 6, there's a nasty performance penalty for doing so. It turns out that having a PROCEDURE or FUNCTION statement at the beginning of a stand-alone PRG file increases the execution time by a factor of as much as 10!

You read that right. It can take 10 times as long to execute a PRG that begins with PROCEDURE or FUNCTION as one with no header. Hearing about this goodie (no, we didn't discover it ourselves), we tested a couple of other alternatives. It turns out that using DO <routine> IN <PRG file> cuts the penalty down some, but it's still twice as slow as simply eliminating or commenting out the header line.
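In other words, for those older versions, a stand-alone PRG runs fastest looking like this (the routine is hypothetical):

* SayHello.prg
* PROCEDURE SayHello   && leave this header out, or commented, in pre-SP3 VFP 6
LPARAMETERS cName
WAIT WINDOW "Hello, " + cName NOWAIT

* Called with: DO SayHello WITH "Fred"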
SETting PROCEDURE TO the PRG, then calling the routine, speeds things up if you only have to do it once, but issuing SET PROCEDURE TO over and over again (as you'd need to for many different PRGs) is about 20 times slower than the slow way. That is, it's 200 times slower than omitting the header in the first place. Even issuing SET PROCEDURE TO ... ADDITIVE repeatedly is slower than just setting it once.
But wait, there's more. Not surprisingly, if the routine you're calling isn't in the current directory, but somewhere along a path you've set, it takes a little longer. For an ordinary routine with no header, the difference isn't much. Same thing if you're using SET PROCEDURE. However, the other two cases get a lot slower when they have to search a path. Using DO <routine> IN <PRG file> when the file isn't in the current directory is just about as slow as doing a SET PROCEDURE. But that's only the bad case. The horrible situation is calling a routine with a PROCEDURE or FUNCTION header directly; it can be as much as 1000 times slower than calling the same routine without the header!
The good news is that the path penalties go away as soon as you add the routines to a project and build an APP or EXE. That is, unless you’re running in a very unusual setup, your users are unlikely to pay this price.
The better news is that the penalty seems to be gone in VFP 6 SP3 and later, where the differences in the way you set up and call a routine are small enough to matter only in situations where you need to squeeze the last drop of performance out. In those cases, you’ll need to test all the alternatives in your configuration to see what produces the best results.
When we were testing the procedure header issue in VFP 7, we ran into some results that didn’t match our expectations at all: wide variations in test speeds across our different machines, and major differences between VFP 6 and VFP 7.
It took quite a while (and a pointer from Mike Stewart of Microsoft) to pin down the difference: What mattered was whether any breakpoints were set in the testing environment. It turns out that, in VFP 7, even with the debugger closed, having breakpoints set slows down execution significantly. In earlier versions, you pay the penalty for breakpoints only if the debugger is open.
We're not sure why Microsoft changed this, and we'll reserve opinion on its wisdom until we know. What we are sure about, though, is that before you do any performance testing in VFP 7, make sure you clear all breakpoints. (It's not enough to uncheck all the breakpoints listed in the Breakpoints dialog. You actually have to remove them.)
Fortunately, this slowdown won’t affect most end-users, since they don’t have the development version of VFP.
FoxPro offers three (well, really, four) different ways to write a loop. Choosing the right one can make a big difference in your program. So can making sure you put only what you have to inside the loop.
Let’s start with the second statement. Every command or function you put inside a loop gets executed every time through the loop. (Big surprise.) Put enough extra stuff in there and you can really slow a program down. The trick is to put each statement only where you need it. This is especially true when you’ve got nested loops—putting a command farther in than it has to be might mean it gets executed dozens more times than necessary.
Bottom line here: If the command doesn’t depend on some characteristic of the loop (like the loop counter or the current record) and it doesn’t change a variable that’s changed elsewhere in the loop, it can probably go outside the loop.
Here’s an example:
* Assume aRay is a 2-D array containing all numeric data.
* We're looking for a row where the sum of the first three columns is
* greater than 100.
lFound = .F.
nRowCnt = 1
DO WHILE NOT lFound AND nRowCnt <= ALEN(aRay,1)
   IF aRay[nRowCnt,1] + aRay[nRowCnt,2] + aRay[nRowCnt,3] > 100
      lFound = .T.
   ELSE
      lFound = .F.
      nRowCnt = nRowCnt + 1
   ENDIF
ENDDO
The version below eliminates repeated calls to ALEN() and the need for the lFound variable. Our tests in VFP 7 show that it's about an order of magnitude faster than the original code.
nNumofRows = ALEN(aRay,1)
nRowCnt = 1
DO WHILE aRay[nRowCnt,1] + aRay[nRowCnt,2] + aRay[nRowCnt,3] <= 100 AND ;
         nRowCnt < nNumofRows
   nRowCnt = nRowCnt + 1
ENDDO
* When the loop ends, test row nRowCnt: if its sum is over 100, that's
* your row; otherwise, no row qualifies.
We find we’re most likely to make this particular mistake when we’re dealing with nested loops, so scrutinize those especially carefully.
In the case of loops that execute a fixed number of times, FOR is a better choice than DO WHILE. Because the counting and checking feature is built into FOR, it just plain goes faster than DO WHILE. In a simple test with a loop that did nothing at all except loop, FOR was more than 10 times faster than DO WHILE. Never write a loop like this:
nCnt = 1
DO WHILE nCnt <= nTopValue
   * Do something here.
   nCnt = nCnt + 1
ENDDO
Always use this instead:
FOR nCnt = 1 TO nTopValue
   * Do something here.
ENDFOR
Guess what? DO WHILE isn't the best choice for looping through records either. SCAN was designed to process a table efficiently and does it faster than DO WHILE. Our results show that SCAN is one-and-a-half to two times faster at simply going through an unordered table one record at a time.
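For the record, here's the shape of the two loops:

* The DO WHILE way:
GO TOP
DO WHILE NOT EOF()
   * ... process the record ...
   SKIP
ENDDO

* The SCAN way: faster, and the record movement is built in.
SCAN
   * ... process the record ...
ENDSCAN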
To give full disclosure, we have found that with some index orders, DO WHILE was as much as 20 percent faster. With other indexes, SCAN is faster, although it doesn't appear to have the same advantage as in an unordered table. (It's also worth noting that, with large tables, if the memory allocation to FoxPro isn't properly tuned (see below), DO WHILE can be faster than SCAN.)
A word to the wise here: When you're tuning existing code, don't just globally replace your DO WHILE loops with SCAN...ENDSCAN. SCAN has a built-in SKIP; if your code already has logic to perform a SKIP within the loop, you can inadvertently skip over some records. Make sure to pull out those SKIPs.
FOR EACH, His Own

There's one more special-purpose looping construct: FOR EACH. It's designed to go through the elements of a collection or an array, giving you access to each in turn. We like it a lot for working with collections, where it makes the code more readable.
We're sorry to say that using FOR EACH doesn't seem to offer any performance improvement over a counted FOR loop. On the other hand, it doesn't seem to be measurably slower either, so use whichever gives you the greatest readability in any given situation.
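For example, inside a form method, working through the form's controls reads nicely with FOR EACH:

LOCAL oControl
FOR EACH oControl IN ThisForm.Controls
   oControl.Enabled = .F.   && disable every control in one pass
ENDFOR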
One of the capabilities that OOP gives us is “wrapper classes.” These classes let us take a collection of related capabilities and put them all into a single class—“wrapping” them all up into one easy-to-use package. The class gives us a more consistent interface to the functions involved and generally presents a tidy package.
The Connection Manager class described in the Reference section (see SQLConnect()) is pretty much a wrapper class, though it adds some capabilities. We've seen folks suggest wrapper classes for the FoxTools library (which desperately needs a consistent interface despite the addition of lots of its residents to the language). During the beta test for VFP 3, we played around on and off for months with a wrapper class for array functions that would let us stop worrying about things like the second parameter to ALEN().
On the whole, wrapper classes sound pretty attractive. Unfortunately, they also add a fair amount of overhead.
There's another way to do the same thing: just create an old-fashioned procedure file. Since SET PROCEDURE has an ADDITIVE clause, it's no big deal to have lots of procedure libraries around. It turns out, of course, that procedure libraries also carry an overhead penalty.
So, if there's significant overhead with wrapper classes and procedure files, what do you do? Because the contents of the class or library matter so much, it's hard to produce benchmarks that give you hard and fast rules about this stuff. We tested with our embryonic array-handler class, using only some of the simpler methods included (aIsArray, aElemCount, aRowCount, aColCount, aIs2D, all of which do exactly what their names suggest). We set it up as a class and as a procedure library. Then, we wrote a program that made a sample series of calls. We also wrote the same functionality in native code (ALEN() for aElemCount, ALEN(,1) for aRowCount and so on).
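To make the comparison concrete, here's roughly what one of those functions looked like in each incarnation. This is our sketch, not the full handler, and the names are ours:

* As a procedure-library function:
FUNCTION aRowCount
LPARAMETERS aArr
EXTERNAL ARRAY aArr
RETURN ALEN(aArr, 1)

* As a wrapper-class method:
DEFINE CLASS ArrayHandler AS Custom
   FUNCTION aRowCount
      LPARAMETERS aArr
      EXTERNAL ARRAY aArr
      RETURN ALEN(aArr, 1)
   ENDFUNC
ENDDEFINE

* Called as aRowCount(@aMyArray) or oHandler.aRowCount(@aMyArray);
* arrays must be passed by reference with @.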
The sad result is that either a procedure library or a class is an order of magnitude slower than using the built-in functionality. In this example, the procedure library was pretty much always faster than the class, but the exact difference varied and wasn’t enough to worry about.
We also tested the same functions as stand-alone programs. The timing came out pretty much the same as the procedure library and the class. (Well, mostly. One of us found the stand-alone programs uniformly another order of magnitude slower. We’re not sure what was going on there, perhaps the problem described in “Watch Out for Breakpoints.”)
Our guess is that, as functionality becomes more complex, the overhead counts less. Given the other, overwhelming benefits of using modular code, we don't recommend you stop writing procedures. But, at this point, we can't recommend wrapper classes where a procedure library would do. The key question to answer is whether you're creating an object that has behaviors. Typically, behaviors need some common data, such as a set of properties to store the initial values of SET commands in an object that uses the Init and Destroy methods to push/pop the SETtings. If all you need to do is perform some function (transform some data, for example), use a function. But when you need more than that, go for the wrapper class.
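The classic example is a class that saves a SET value on the way in and restores it on the way out. Here's a minimal sketch of the push/pop idea, using SET TALK:

DEFINE CLASS SetTalkOff AS Custom
   cOldTalk = ""
   PROCEDURE Init
      This.cOldTalk = SET("TALK")  && remember the current setting
      SET TALK OFF
   ENDPROC
   PROCEDURE Destroy
      IF This.cOldTalk = "ON"      && put it back the way we found it
         SET TALK ON
      ELSE
         SET TALK OFF
      ENDIF
   ENDPROC
ENDDEFINE

Instantiate it at the top of a routine; when the object goes out of scope, Destroy restores the setting automatically.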
There are some benefits to a wrapper class, of course. The biggest benefit is the ability to subclass to provide specialized behaviors. Where this is a possibility, it’s worth the overhead.
You wouldn't think that a little thing like a name would matter so much. But it does. The name we're referring to is the Name property possessed by almost every object you can create in Visual FoxPro. (A few of the weird marriages of Xbase to OOP, like SCATTER NAME, produce objects without a Name property.)

For code-based classes, when you CREATEOBJECT() an object whose class definition doesn't assign a value to the Name property, Visual FoxPro makes one up for you. That's nice. Except, prior to VFP 7, it insists on making it unique (usually, the class name or a variant thereof, followed by one or more digits, like Form3 or Text17). The problem is, as the number of objects of that class grows, making sure a name is unique takes longer and longer. The Microsoft folks say the time grows exponentially. We suspect that's an overstatement and that it's actually geometric. What it ain't is linear. What it really ain't is fast enough. (Before we go any further with this, we should point out why this applies only to code classes: All VCX-based classes have an implicit assignment of the Name property, so there's never a need to assign a name at instantiation time.)
We tested in VFP 6 with a pair of very simple classes based on Custom. One contained nothing. The other contained an explicit assignment to Name. With 10 repetitions, the explicitly named class would instantiate so fast it couldn't be measured, but the nameless class was fast, too. By 100 repetitions, explicit naming was more than four times as fast. At 1000 repetitions, the explicit version was eight to 10 times faster. At 5000 of each class, explicit names were about 18 times faster to instantiate than nameless objects.
The moral of the story here is easy. In VFP 6 and earlier, always assign a value to the Name property for any class you write in code.
The happy ending is that Microsoft changed this behavior in VFP 7, and now newly instantiated classes simply take the class name as their Name.
Looks can be deceiving, but in this case, they're not. The form property LockScreen lets you make a series of changes to a form without the individual changes showing as you go. When you set LockScreen back to .F., all the changes appear to occur simultaneously. Visually, it's far more consistent.
We were all set to tell you that this is one of those times where the user’s eyes will play tricks on him. He’ll think the update is faster because he doesn’t see the individual changes take place.
But guess what? The update really is faster this way. We tested a simple form with just a few controls. We changed about 20 properties of the form and the controls. With LockScreen set to .T., the updates were very slightly faster in each case. Surprise: the version that looks better is faster, too. We're fairly certain it's because Windows has to redraw the screen only once, and screen redraws aren't the fastest thing that the operating system can do.
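The pattern is simple:

ThisForm.LockScreen = .T.
* ... change fonts, colors, captions, whatever ...
ThisForm.LockScreen = .F.  && one redraw instead of twenty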
Testing the type of a variable or field is one of those things we do a lot in our code. Starting in VFP 6, it's something we can do faster than ever. The VARTYPE() function is significantly faster than its predecessor, TYPE(). How much faster? With both variables and fields, we consistently find VARTYPE() two to three-and-a-half times faster than TYPE(). The smallest differences occurred when checking non-existent variables. (A code maintenance and debugging tip here: Not only is VARTYPE() faster, but you don't have to pass the quoted variable name to VARTYPE(), eliminating one more of those needless development errors. VARTYPE(MyVar) is functionally equivalent to TYPE("MyVar").)
One warning here: VARTYPE() is appropriate only for fields, variables, properties and the like. If you use it with an expression to find out what type the result will be, be aware that you actually evaluate the expression (in fact, it's evaluated before it's even passed to VARTYPE()). In addition, VARTYPE() is useful only for items that can be evaluated. In particular, you can't use it to check whether a particular property exists, which you can do with TYPE(). For example, VARTYPE(oObject.Name) fails if oObject doesn't exist or doesn't have a Name property. However, TYPE("oObject.Name") works, returning "U" in that case. So, don't throw TYPE() out of your toolkit quite yet.
Building up strings has become increasingly important, as we need to create HTML and XML and who-knows-what-else-ML. Fortunately, VFP has been tuned to make construction of large strings extremely efficient.
However, there’s one important trick you need to know. Building a string from left to right is fast; building it from right to left isn’t. What does that mean? It means that a series of assignments in the form:
cString = cString + cMoreStuff
is much faster than:
cString = cMoreStuff + cString
regardless of the actual contents of the variables.
How much faster? That does depend on the contents of the variables or, more specifically, the length of the strings involved. As the string gets larger, the first version gets better and better. To build a string of 30,000 characters, adding three at a time, the first version runs about two orders of magnitude faster than the second.
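A quick sketch of the fast, left-to-right pattern:

cString = ""
FOR nI = 1 TO 10000
   cString = cString + "<x>"  && appending on the right: VFP optimizes this
ENDFOR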
VFP 6 also introduced a new way to create objects. The NewObject() function lets you instantiate objects without worrying about whether you've pointed to the class library ahead of time; instead, you just include the library name in the call. CREATEOBJECT(), of course, needs the library in the current list, with either a SET CLASSLIB or SET PROCEDURE ahead of time.
So which way is faster? As usual, the answer is "it depends." With VCX-based classes, if you can issue SET CLASSLIB just once and then instantiate classes from that library repeatedly, CREATEOBJECT() is the way to go. It's anywhere from four to ten times faster than calling NewObject() with the class library. On the other hand, if you need to load the library each time, the SET CLASSLIB/CREATEOBJECT() pair is in the same ballpark as NewObject().
How about for classes written in code? In that case, issuing a single SET PROCEDURE and calling CREATEOBJECT() repeatedly is five to ten times faster than either NewObject() or the SET PROCEDURE/CREATEOBJECT() pair, which are pretty similar.
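In code, the winning pattern looks something like this (library and class names are ours):

SET CLASSLIB TO MyControls ADDITIVE     && pay the library-open cost once
FOR nI = 1 TO 1000
   oThing = CREATEOBJECT("MyButton")    && fast from here on
ENDFOR

* For a class you need only once in a while:
oRare = NEWOBJECT("MyWizard", "Wizards.vcx")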
In the VFP 6 version of this book, we reported that instantiating a coded class was a little faster than instantiating a VCX-based class, but not enough faster to wipe out the benefits of developing classes visually. In VFP 7, we see varying results: Sometimes the coded class instantiates faster, while at other times, the VCX-based class is faster. Bottom line: We’ll stand by our advice to develop visual objects visually.
We're not really surprised that NewObject() is generally slower than CREATEOBJECT(). It's doing a lot of work behind the scenes. Here's the sequence: Save the names of all open class libraries, then close them. Next, open the specified class library, instantiate the object, and close the class library. Finally, reopen all the formerly open class libraries. Whew, that's a lot of files to find and mess with. Of course, we don't understand why VFP doesn't check whether the specified library is in the current list before going to all that trouble; seems to us it could speed up NewObject() considerably, in most cases.
We tested and found no performance penalty for having a lot of class libraries open, no matter where in the list the class you're instantiating is found. So the rule here is to think about how you're going to do things before you write the code and, if possible, just keep open the class libraries you use a lot. Then use NewObject() for the one-shots, the classes from libraries you need only once in a while.
Can you have too much memory? It turns out that, in VFP, the answer is "yes." When you start VFP, it figures out how much memory it ought to be able to use, if it needs it. The number is generally about half as much as the machine actually has. Often, the amount that VFP picks is too much.
How can you have too much memory? Like this, according to our buddy Mac Rubel, who knows more about this topic than anyone else—more even, we suspect, than the folks who wrote VFP. However much memory VFP grabs, it assumes it has that much physical memory to work with. But, because it takes so much memory, it often doesn’t—some of the memory it’s working with is really disk space pretending to be memory, and that’s slow. By decreasing the amount of memory VFP thinks it has available, you ensure that it uses only physical memory. VFP knows what to do when it needs more memory than it has available, and it’s good at that. The last thing you want happening is the operating system swapping virtual (disk) memory for real memory while FoxPro thinks it is using RAM. So, as long as you restrict it to using physical memory, things are fast, fast, fast.
Okay, so how do you do that? Use the SYS(3050) function. SYS(3050,1) controls foreground memory, the memory VFP has available when it's in charge. SYS(3050,2) is for background memory: how much memory FoxPro should have when you're off doing something else. In either case, you pass it a number and it rounds that down to a number it likes (multiples of 256), and that's how much memory it uses. It even tells you how much it really took.
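For example (the sizes here are just a starting point for your own trial and error):

? SYS(3050, 1, 64 * 1024 * 1024)  && cap foreground memory at 64 MB; returns what it really took
? SYS(3050, 2, 16 * 1024 * 1024)  && background memory, for when VFP isn't in charge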
We’ve been amazed how much of a difference this setting can make. Unfortunately, getting it just right is a matter of trial and error, so try a bunch of different settings until you find the one that seems suited to the way you work.
All of the tips we’ve given you here should speed up your code, but your application on your network with your data is the true test (and “your” refers to the final production environment, not necessarily your development environment). Differences in network throughput, the architecture of your system, the design of your tables, your choice of indexes, the phase of the moon, what’s on TV that night, and so forth, all make significant differences in the performance you see. Our advice is always to examine and benchmark how a particular change affects your system. Keep in mind that a single test isn’t conclusive unless it can be repeated, and you need to repeat tests with caution because FoxPro and the operating system and the network and even your disk controller might be caching information. Also, don’t forget to shut down background processes like your virus scanner and your e-mail client while doing performance testing.
Finally, now that we've given you all this advice, keep in mind that most of the time, you don't need to worry about a few milliseconds here or there. Don't spend a lot of energy optimizing until you know there's a problem. Of course, when there's a clear choice and it's also good practice (like using SCAN instead of DO WHILE to process a table), write your applications using the better choice. But your first task is to get the application working and producing the right results.
If there’s a speed problem at that point, use the Coverage Profiler to help you figure out where the bottleneck is. Then, follow longtime Fox guru George Goley’s advice and “take out the slow parts.” That’s where the information here should be most useful.