Speed is where it’s at. No client is going to pay you to make his app run slower. Fox Software’s initial claim to fame was that FoxBASE (and later, FoxBASE+ and FoxPro) ran faster than its competitors. Fox’s reputation for speed was well deserved. However, the speed gain has never been automatic. You have to do things right to make your code “run like the Fox.”
The Rushmore optimization technology introduced in FoxPro 2.0 is based on indexes. Rushmore examines index files to determine which records meet the conditions for a particular command. So, in order to make things fast, it's important to create the right indexes and to write code that takes advantage of those indexes. The difference between the lightning-fast code your client uses to make crucial strategic decisions and the plodding code his competition uses might be no more than an operator or two, so listen up!
There are also some other tricks, not related to Rushmore, that can speed up your applications considerably. This section discusses both Rushmore and non-Rushmore optimization techniques.
Fox Software always claimed that Rushmore got its name because Dr. Fulton and the development team had been watching Hitchcock's North by Northwest, whose climax takes place on Mount Rushmore. But we have no doubt the name caught on due to the word "rush" embedded in it. In fact, some of the FoxPro documentation and advertising used the phrase "RUSH me some MORE records."
As we mentioned above, the key to getting the most from Rushmore is to create the right indexes and take advantage of them. So how do you know which are the right indexes, and how do you take advantage of them?
Rushmore can optimize the SET FILTER command, any command involving a FOR clause, and SELECT-SQL. The secret (not really a secret; it's documented) is to make the left-hand side of each expression in the filter, FOR or WHERE clause exactly match an existing index tag. For example, to optimize:
SUM OrderTotal FOR state="PA"
you need an index tag for state. If your tag is on UPPER(state) instead, you'd want to write the command as:
SUM OrderTotal FOR UPPER(state)="PA"
Suppose you want to find everyone named Miller in a table of Clients, and that you have a tag on UPPER(cLastName+cFirstName) to put folks in alphabetical order. You optimize the BROWSE by writing it as:
BROWSE FOR UPPER(cLastName+cFirstName)="MILLER"
even though you’re really interested only in the last name.
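If you want to see whether Rushmore is actually kicking in for a query, the SYS(3054) SQL ShowPlan function will tell you. Here's a minimal sketch; the table and tag names are ours:

USE Clients
INDEX ON UPPER(cLastName+cFirstName) TAG NameOrd  && the tag the condition must match
SYS(3054, 11)  && turn on ShowPlan: reports filter and join optimization
SELECT * FROM Clients WHERE UPPER(cLastName+cFirstName) = "MILLER"
SYS(3054, 0)   && turn ShowPlan back off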
We’ve answered the second question—how to take advantage of existing tags—but we still haven’t tackled the first: What are the right indexes to create? That’s because it’s not always straightforward. There are a few clear-cut rules, but to a great extent, you’ll need to use your judgment and test your theories against your data, on your hardware. Here are the rules:
Create a tag for your primary key, the field that uniquely identifies the record. (Do this whether or not you define it as a primary key in the database.) You’ll need it to look up particular records and for setting relations. (If your table is in a database, you’ll want a tag for the primary key anyway for creating persistent relations.)
Create a tag for any field or expression you expect to search on frequently.
Create a tag for any field or expression you think you’ll want to filter data on frequently. (This is to let Rushmore kick in.)
Make sure the tags you create exactly match the conditions you’ll need to search or filter on.
Don’t automatically create tags on every field. (That’s called inverting the table.) It can make adding and updating records slower than necessary, especially if you have a lot of fields in your table. On the flip side, if you have a table, especially one that is rarely changed, where you do use every field in filters, then go ahead and invert the table.
Don't create indexes with a NOT expression for Rushmore optimization. Rushmore ignores any tag whose expression contains NOT. If you need the NOT expression, say, for a filter, create both indexes, one with and one without the NOT.
Don't filter your tags. That is, don't use the FOR clause of the INDEX command. Rushmore ignores tags that are filtered. If you need a filtered tag for some reason, and you're likely to filter on that tag's index expression as well, create an unfiltered tag, too, as in the sketch after this list.
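Here's what that last rule looks like in practice; the table, field and tag names are hypothetical:

USE Orders
INDEX ON dOrderDate TAG OrdDate                && unfiltered: Rushmore can use this one
INDEX ON dOrderDate TAG OpenDate FOR !lFilled  && filtered: handy elsewhere, but Rushmore ignores it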
In general, you'll be trading off update speed for search speed. So, think about what you expect to do with this table. If it's going to have lots of additions but few searches, keep the number of tags to a minimum. If it'll be used for lots of searching, but rarely updated, create more tags. You have other middle-of-the-road options, too: If you do an intensive filtering process once a month, but primarily do updates the rest of the time, consider creating temporary tags (in their own CDX, or as stand-alone compact indexes) for the process. On the flip side, if you primarily handle queries against the data, but add large batches of the data infrequently, consider dropping all of the tags for the bulk load (after saving their definitions, of course) and then re-creating them afterwards.
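The temporary-tag approach might look something like this sketch (the file and field names are ours):

* Once a month, before the big filtering job:
USE Sales
INDEX ON cRegion TO SalesTmp COMPACT  && stand-alone compact IDX; the structural CDX is untouched
* ... run the monthly process ...
SET INDEX TO                          && closes the IDX, leaves the structural CDX open
DELETE FILE SalesTmp.idx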
For many years (and in the earlier editions of this book), one of the mantras of optimization advice was to index every table on the DELETED() function if an application was to run with SET DELETED ON. This advice was accepted by virtually everyone who knew anything about Rushmore. Here's the way we explained it in the VFP 6 edition of this book:
"Even in many complex queries and FOR clauses, Rushmore performs its magic almost entirely on the relatively small and compact CDX file, a file structured with nodes, branches and leaves to be searched efficiently. When DELETED is ON, FoxPro has to check each and every record in a result set (whether from a query, a filter, or FOR) to see if it's deleted, even if no records are actually deleted. This sequential reading of the entire cursor or file completely defeats the benefits of Rushmore. Don't do it!

"By creating a tag on DELETED(), you let Rushmore do the checking instead of looking at each record sequentially, which makes the whole thing much faster. The larger the result set, the more speed-up you'll see."
Sounds really good, doesn’t it?
The walls came crashing down on this piece of wisdom with an article by Chris Probst in the May '99 issue of FoxPro Advisor. Probst was working with an extremely large data set in a network environment and found that some of the queries were just too slow. Monitoring network traffic revealed the problem: The portion of the index file related to the DELETED() tag was huge (at least in some cases), and transferring it across the network was bogging down the whole query.
Probst's experiments determined that, when an expression has only a few discrete values (like .T. and .F.), and the values are unevenly distributed (as is typically the case with DELETED(), since in general, few records are deleted), queries run faster without a tag on the expression. In those cases, it's better to let VFP narrow things down first based on the other expressions involved, and then do a sequential check of the remaining records.
The phenomenon Chris witnessed is not unheard of in the database world. An index with only a few distinct values is said to have "low selectivity" and is considered suspect in most database designs. But this was the first documented case of the low-selectivity issue appearing in FoxPro's remarkable optimization.
What does all this boil down to for you? For small tables in a desktop or LAN situation, we generally think having a tag on DELETED() is a better choice. As tables get larger, or when operating in a WAN situation, skip it. And just for good measure, you should test the performance, in your specific production (as opposed to development) environment, with and without the DELETED() tag.
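The tag itself is a one-liner; it's the decision of whether to create it that takes the testing:

INDEX ON DELETED() TAG IsDel  && often a win for small tables on a desktop or LAN; skip it on a WAN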
Another common problem goes like this. In troubleshooting sessions we attend, someone complains that a filter should be optimized, but it’s dog slow. He’s asked to show the filter and the tags. Everything looks good for Rushmore to optimize the filter. Puzzling.
Then he shows the code he’s using. Typically, it looks something like this:
SET FILTER TO <something optimizable>
GO TOP && put filter in effect
and the light goes on. GO TOP and GO BOTTOM are not optimizable commands. They move through the records sequentially, attempting to find the first record matching the filter.
Without a filter (and with SET DELETED OFF), this isn't generally a problem. Moving to the top or bottom of the current order is pretty quick. FoxPro can either locate the first or last record in the index or, if no tag is set, move directly to the beginning or end of the file.
But when a filter is set (or DELETED is ON, which is like having a filter set), once GO gets to the first or last record in the order, it has to search sequentially for the first record that matches the filter condition. This is what's so slow. Smart like a fox, eh? What a dumb idea! It's like writing code to go to record 10 by issuing a SKIP, asking whether RECNO()=10, and if not, SKIPping again.
What can you do about it? Don't use GO TOP and GO BOTTOM. How do you avoid them? By using a neat trick. It turns out that LOCATE with no FOR clause goes to the first record in the current order. So, for GO TOP, you just issue LOCATE, like this:
SET FILTER TO <optimizable condition>
LOCATE && Same as GO TOP
Okay, that works for finding the first record. What about the last record? You have to stand on your head for this. Well, almost. You really have to stand the table on its head. Try it like this:
SET FILTER TO <optimizable condition>
* Reverse index order
lDescending = DESCENDING()
IF lDescending
   SET ORDER TO ORDER() ASCENDING
ELSE
   SET ORDER TO ORDER() DESCENDING
ENDIF
* Now Top is Bottom and Bottom is Top
LOCATE && Top of the reversed order, which is really the bottom
IF lDescending
   SET ORDER TO ORDER() DESCENDING
ELSE
   SET ORDER TO ORDER() ASCENDING
ENDIF
After setting the filter (or with a filter already in effect), you turn the index upside down. If it was ascending, you make it descending; if it was descending, you make it ascending. Then, use LOCATE to go to the first record. Since you've reversed the order, that's the last record in the order you want. Then, reverse the order again. Voila! You're on the bottom record.
By the way, the code above works only if there is an index order set. If there might be no order, you have to check for that.
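A sketch of that check might look like this:

IF EMPTY(ORDER())
   * No index order is set, so GO BOTTOM is already quick.
   GO BOTTOM
ELSE
   * Use the reverse-the-order LOCATE trick shown above.
ENDIF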
One more warning. Under particular circumstances, the work-around can be very slightly slower than just using GO. In most cases, though, it tends to be an order of magnitude faster. We think it's worth it.
We should also comment that, in VFP 7, we've seen some cases where GO TOP and GO BOTTOM seem to behave in a more optimized way. But enough of our tests still show the LOCATE technique to be faster for us to stick with it.
SQL-SELECT has two clauses that filter data: WHERE and HAVING. A good grasp of the English language might lead us to believe that these are synonyms, but SQL is not English, and mixing these two indiscriminately is a sure-fire disaster in the making! It's not obvious at first glance where a particular condition should go. But getting it wrong can lead to a significant slowdown.
Here's why. The conditions in WHERE filter the original data. Wherever possible, existing index tags are used to speed things up. This produces an intermediate set of results. HAVING operates on the intermediate results, with no tags in sight. So, by definition, HAVING is slower than WHERE, if a query is otherwise constructed to be optimized.
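To make it concrete, here's a sketch of the same row filter written both ways, assuming a tag on cState:

* Both return the same rows, but only the first can use the tag:
SELECT * FROM Customer WHERE cState = "PA"   && Rushmore-optimizable
SELECT * FROM Customer HAVING cState = "PA"  && filters the intermediate results: slow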
So, when should you use HAVING? When you group data with GROUP BY and want to filter not on data from the original tables, but on "aggregate data" formed as the result of the grouping. For example, if you group customers by state, counting the number in each, and you're interested only in states with three or more customers, you'd put the condition COUNT(*)>=3 in the HAVING clause.
SELECT cState,COUNT(*) ;
FROM Customer ;
GROUP BY cState ;
HAVING COUNT(*)>=3
A simple rule of thumb: Don't use HAVING unless you also have a GROUP BY. That doesn't cover all the cases, but it eliminates many mistakes. To make the rule complete, remember that a condition in HAVING should contain one of the aggregate functions (COUNT, SUM, AVG, MAX or MIN) or a field that was named with AS and uses an aggregate function.
Unfortunately, simple rules aren't always the best rules. There's one other situation where you may choose to use HAVING rather than WHERE: when you're specifically trying to avoid having Rushmore come into play. When would that be? When you have an index on a field (presumably because you need it elsewhere), but there are only a few distinct values for the field. If the query has other optimizable fields that narrow the result set down to a small percentage of records, the cost of reading the relevant part of this index may be greater than the cost of sequentially checking the records isolated by the other conditions. In that case, moving the condition to the HAVING clause means that Rushmore ignores it, and the index isn't read.
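So, to deliberately keep Rushmore away from a low-selectivity index, you might write something like this sketch (the field names are ours; note that VFP allows HAVING without GROUP BY, though it's usually a mistake):

SELECT * FROM Orders ;
   WHERE cCustId = "ALFKI" ;  && selective and optimized: isolates a handful of records
   HAVING lPaid = .F.         && low-selectivity condition: checked sequentially afterward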
FoxPro lets you store procedures and functions in a variety of places. But using the Project Manager gives you a strong incentive to put each routine in a separate PRG file. We generally agree with this choice.
But, if you're not careful, in versions through the original release of VFP 6, there's a nasty performance penalty for doing so. It turns out that having a PROCEDURE or FUNCTION statement at the beginning of a stand-alone PRG file increases the execution time by a factor of as much as 10!

You read that right. It can take 10 times as long to execute a PRG that begins with PROCEDURE or FUNCTION as one with no header. Hearing about this goodie (no, we didn't discover it ourselves), we tested a couple of other alternatives. It turns out that using DO <routine> IN <PRG file> cuts the penalty down some, but it's still twice as slow as simply eliminating or commenting out the header line.
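In other words, for those older versions, a stand-alone PRG runs fastest looking like this (the routine is hypothetical):

* SayHello.prg
* PROCEDURE SayHello   && leave this header out, or commented, in pre-SP3 VFP 6
LPARAMETERS cName
WAIT WINDOW "Hello, " + cName NOWAIT

* Called with: DO SayHello WITH "Fred"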
SETting PROCEDURE TO the PRG, then calling the routine, speeds things up if you only have to do it once, but issuing SET PROCEDURE TO over and over again (as you'd need to for many different PRGs) is about 20 times slower than the slow way. That is, it's 200 times slower than omitting the header in the first place. Even issuing SET PROCEDURE TO ... ADDITIVE repeatedly is slower than just setting it once.
But wait, there's more. Not surprisingly, if the routine you're calling isn't in the current directory, but somewhere along a path you've set, it takes a little longer. For an ordinary routine with no header, the difference isn't much. Same thing if you're using SET PROCEDURE. However, the other two cases get a lot slower when they have to search a path. Using DO <routine> IN <PRG file> when the file isn't in the current directory is just about as slow as doing a SET PROCEDURE. But that's only the bad case. The horrible situation is calling a routine with a PROCEDURE or FUNCTION header directly; it can be as much as 1000 times slower than calling the same routine without the header!
The good news is that the path penalties go away as soon as you add the routines to a project and build an APP or EXE. That is, unless you’re running in a very unusual setup, your users are unlikely to pay this price.
The better news is that the penalty seems to be gone in VFP 6 SP3 and later, where the differences in the way you set up and call a routine are small enough to matter only in situations where you need to squeeze the last drop of performance out. In those cases, you’ll need to test all the alternatives in your configuration to see what produces the best results.
When we were testing the procedure header issue in VFP 7, we ran into some results that didn’t match our expectations at all: wide variations in test speeds across our different machines, and major differences between VFP 6 and VFP 7.
It took quite a while (and a pointer from Mike Stewart of Microsoft) to pin down the difference: What mattered was whether any breakpoints were set in the testing environment. It turns out that, in VFP 7, even with the debugger closed, having breakpoints set slows down execution significantly. In earlier versions, you pay the penalty for breakpoints only if the debugger is open.
We're not sure why Microsoft changed this, and we'll reserve opinion on its wisdom until we know. What we are sure about, though, is that before you do any performance testing in VFP 7, make sure you clear all breakpoints. (It's not enough to uncheck all the breakpoints listed in the Breakpoints dialog. You actually have to remove them.)
Fortunately, this slowdown won’t affect most end-users, since they don’t have the development version of VFP.
FoxPro offers three (well, really, four) different ways to write a loop. Choosing the right one can make a big difference in your program. So can making sure you put only what you have to inside the loop.
Let’s start with the second statement. Every command or function you put inside a loop gets executed every time through the loop. (Big surprise.) Put enough extra stuff in there and you can really slow a program down. The trick is to put each statement only where you need it. This is especially true when you’ve got nested loops—putting a command farther in than it has to be might mean it gets executed dozens more times than necessary.
Bottom line here: If the command doesn’t depend on some characteristic of the loop (like the loop counter or the current record) and it doesn’t change a variable that’s changed elsewhere in the loop, it can probably go outside the loop.
Here’s an example:
* Assume aRay is a 2-D array containing all numeric data.
* We're looking for a row where the sum of the first three columns is
* greater than 100.
lFound = .F.
nRowCnt = 1
DO WHILE NOT lFound AND nRowCnt <= ALEN(aRay,1)
   IF aRay[nRowCnt,1] + aRay[nRowCnt,2] + aRay[nRowCnt,3] > 100
      lFound = .T.
   ELSE
      lFound = .F.
      nRowCnt = nRowCnt + 1
   ENDIF
ENDDO
The version below eliminates repeated calls to ALEN() and the need for the lFound variable. Our tests in VFP 7 show that it's about an order of magnitude faster than the original code.
nNumofRows = ALEN(aRay,1)
nRowCnt = 1
DO WHILE aRay[nRowCnt,1] + aRay[nRowCnt,2] + aRay[nRowCnt,3] <= 100 AND ;
         nRowCnt < nNumofRows
   nRowCnt = nRowCnt + 1
ENDDO
* When the loop ends, test row nRowCnt: if its sum is over 100, that's
* your row; otherwise, no row qualifies.
We find we’re most likely to make this particular mistake when we’re dealing with nested loops, so scrutinize those especially carefully.
In the case of loops that execute a fixed number of times, FOR is a better choice than DO WHILE. Because the counting and checking feature is built into FOR, it just plain goes faster than DO WHILE. In a simple test with a loop that did nothing at all except loop, FOR was more than 10 times faster than DO WHILE. Never write a loop like this:
nCnt = 1
DO WHILE nCnt <= nTopValue
   * Do something here.
   nCnt = nCnt + 1
ENDDO
Always use this instead:
FOR nCnt = 1 TO nTopValue
   * Do something here.
ENDFOR
Guess what? DO WHILE isn't the best choice for looping through records either. SCAN was designed to process a table efficiently and does it faster than DO WHILE. Our results show that SCAN is one-and-a-half to two times faster at simply going through an unordered table one record at a time.
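For the record, here's the shape of the two loops:

* The DO WHILE way:
GO TOP
DO WHILE NOT EOF()
   * ... process the record ...
   SKIP
ENDDO

* The SCAN way: faster, and the record movement is built in.
SCAN
   * ... process the record ...
ENDSCAN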
To give full disclosure, we have found that with some index orders, DO WHILE was as much as 20 percent faster. With other indexes, SCAN is faster, although it doesn't appear to have the same advantage as in an unordered table. (It's also worth noting that, with large tables, if the memory allocation to FoxPro isn't properly tuned (see below), DO WHILE can be faster than SCAN.)
A word to the wise here: When you're tuning existing code, don't just globally replace your DO WHILE loops with SCAN...ENDSCAN. SCAN has a built-in SKIP; if your code already has logic to perform a SKIP within the loop, you can inadvertently skip over some records. Make sure to pull out those SKIPs.
FOR EACH, His Own

There's one more special-purpose looping construct: FOR EACH. It's designed to go through the elements of a collection or an array, giving you access to each in turn. We like it a lot for working with collections, where it makes the code more readable.
We're sorry to say that using FOR EACH doesn't seem to offer any performance improvement over a counted FOR loop. On the other hand, it doesn't seem to be measurably slower either, so use whichever gives you the greatest readability in any given situation.
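For example, inside a form method, working through the form's controls reads nicely with FOR EACH:

LOCAL oControl
FOR EACH oControl IN ThisForm.Controls
   oControl.Enabled = .F.   && disable every control in one pass
ENDFOR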
One of the capabilities that OOP gives us is “wrapper classes.” These classes let us take a collection of related capabilities and put them all into a single class—“wrapping” them all up into one easy-to-use package. The class gives us a more consistent interface to the functions involved and generally presents a tidy package.
The Connection Manager class described in the Reference section (see SQLConnect()) is pretty much a wrapper class, though it adds some capabilities. We've seen folks suggest wrapper classes for the FoxTools library (which desperately needs a consistent interface despite the addition of lots of its residents to the language). During the beta test for VFP 3, we played around on and off for months with a wrapper class for array functions that would let us stop worrying about things like the second parameter to ALEN().
On the whole, wrapper classes sound pretty attractive. Unfortunately, they also add a fair amount of overhead.
There's another way to do the same thing: just create an old-fashioned procedure file. Since SET PROCEDURE has an ADDITIVE clause, it's no big deal to have lots of procedure libraries around. It turns out, of course, that procedure libraries also carry an overhead penalty.
So, if there's significant overhead with wrapper classes and procedure files, what do you do? Because the contents of the class or library matter so much, it's hard to produce benchmarks that give you hard and fast rules about this stuff. We tested with our embryonic array-handler class, using only some of the simpler methods included (aIsArray, aElemCount, aRowCount, aColCount, aIs2D, all of which do exactly what their names suggest). We set it up as a class and as a procedure library. Then, we wrote a program that made a sample series of calls. We also wrote the same functionality in native code (ALEN() for aElemCount, ALEN(,1) for aRowCount and so on).
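To make the comparison concrete, here's roughly what one of those functions looked like in each incarnation. This is our sketch, not the full handler, and the names are ours:

* As a procedure-library function:
FUNCTION aRowCount
LPARAMETERS aArr
EXTERNAL ARRAY aArr
RETURN ALEN(aArr, 1)

* As a wrapper-class method:
DEFINE CLASS ArrayHandler AS Custom
   FUNCTION aRowCount
      LPARAMETERS aArr
      EXTERNAL ARRAY aArr
      RETURN ALEN(aArr, 1)
   ENDFUNC
ENDDEFINE

* Called as aRowCount(@aMyArray) or oHandler.aRowCount(@aMyArray);
* arrays must be passed by reference with @.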
The sad result is that either a procedure library or a class is an order of magnitude slower than using the built-in functionality. In this example, the procedure library was pretty much always faster than the class, but the exact difference varied and wasn’t enough to worry about.
We also tested the same functions as stand-alone programs. The timing came out pretty much the same as the procedure library and the class. (Well, mostly. One of us found the stand-alone programs uniformly another order of magnitude slower. We’re not sure what was going on there, perhaps the problem described in “Watch Out for Breakpoints.”)
Our guess is that, as functionality becomes more complex, the overhead counts less. Given the other, overwhelming benefits of using modular code, we don't recommend you stop writing procedures. But, at this point, we can't recommend wrapper classes where a procedure library would do. The key question to answer is whether you're creating an object that has behaviors. Typically, behaviors need some common data, such as a set of properties to store the initial values of SET commands in an object that uses the Init and Destroy methods to push/pop the SETtings. If all you need to do is perform some function (transform some data, for example), use a function. But when you need more than that, go for the wrapper class.
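The classic example is a class that saves a SET value on the way in and restores it on the way out. Here's a minimal sketch of the push/pop idea, using SET TALK:

DEFINE CLASS SetTalkOff AS Custom
   cOldTalk = ""
   PROCEDURE Init
      This.cOldTalk = SET("TALK")  && remember the current setting
      SET TALK OFF
   ENDPROC
   PROCEDURE Destroy
      IF This.cOldTalk = "ON"      && put it back the way we found it
         SET TALK ON
      ELSE
         SET TALK OFF
      ENDIF
   ENDPROC
ENDDEFINE

Instantiate it at the top of a routine; when the object goes out of scope, Destroy restores the setting automatically.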
There are some benefits to a wrapper class, of course. The biggest benefit is the ability to subclass to provide specialized behaviors. Where this is a possibility, it’s worth the overhead.
You wouldn't think that a little thing like a name would matter so much. But it does. The name we're referring to is the Name property possessed by almost every object you can create in Visual FoxPro. (A few of the weird marriages of Xbase to OOP, like SCATTER NAME, produce objects without a Name property.)

For code-based classes, when you CREATEOBJECT() an object whose class definition doesn't assign a value to the Name property, Visual FoxPro makes one up for you. That's nice. Except, prior to VFP 7, it insists on making it unique (usually, the class name or a variant thereof, followed by one or more digits, like Form3 or Text17). The problem is, as the number of objects of that class grows, making sure a name is unique takes longer and longer. The Microsoft folks say the time grows exponentially. We suspect that's an overstatement and that it's actually geometric. What it ain't is linear. What it really ain't is fast enough. (Before we go any further with this, we should point out why this applies only to code classes: All VCX-based classes have an implicit assignment of the Name property, so there's never a need to assign a name at instantiation time.)
We tested in VFP 6 with a pair of very simple classes based on Custom. One contained nothing. The other contained an explicit assignment to Name. With 10 repetitions, the explicitly named class would instantiate so fast it couldn't be measured, but the nameless class was fast, too. By 100 repetitions, explicit naming was more than four times as fast. At 1000 repetitions, the explicit version was eight to 10 times faster. At 5000 of each class, explicit names were about 18 times faster to instantiate than nameless objects.
The moral of the story here is easy. In VFP 6 and earlier, always assign a value to the Name property for any class you write in code.
The happy ending is that Microsoft changed this behavior in VFP 7, and now newly instantiated classes simply take the class name as their Name.
Looks can be deceiving, but in this case, they're not. The form property LockScreen lets you make a series of changes to a form without the individual changes showing as you go. When you set LockScreen back to .F., all the changes appear to occur simultaneously. Visually, it's far more consistent.
We were all set to tell you that this is one of those times where the user’s eyes will play tricks on him. He’ll think the update is faster because he doesn’t see the individual changes take place.
But guess what? The update really is faster this way. We tested a simple form with just a few controls. We changed about 20 properties of the form and the controls. With LockScreen set to .T., the updates were very slightly faster in each case. Surprise: the version that looks better is faster, too. We're fairly certain it's because Windows has to redraw the screen only once, and screen redraws aren't the fastest thing that the operating system can do.
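The pattern is simple:

ThisForm.LockScreen = .T.
* ... change fonts, colors, captions, whatever ...
ThisForm.LockScreen = .F.  && one redraw instead of twenty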
Testing the type of a variable or field is one of those things we do a lot in our code. Starting in VFP 6, it's something we can do faster than ever. The VARTYPE() function is significantly faster than its predecessor, TYPE(). How much faster? With both variables and fields, we consistently find VARTYPE() two to three-and-a-half times faster than TYPE(). The smallest differences occurred when checking non-existent variables. (A code maintenance and debugging tip here: Not only is VARTYPE() faster, but you don't have to pass the quoted variable name to VARTYPE(), eliminating one more of those needless development errors. VARTYPE(MyVar) is functionally equivalent to TYPE("MyVar").)
One warning here: VARTYPE() is appropriate only for fields, variables, properties and the like. If you use it with an expression to find out what type the result will be, be aware that you actually evaluate the expression (in fact, it's evaluated before it's even passed to VARTYPE()). In addition, VARTYPE() is useful only for items that can be evaluated. In particular, you can't use it to check whether a particular property exists, which you can do with TYPE(). For example, VARTYPE(oObject.Name) fails if oObject doesn't exist or doesn't have a Name property. However, TYPE("oObject.Name") works, returning "U" in that case. So, don't throw TYPE() out of your toolkit quite yet.
Building up strings has become increasingly important, as we need to create HTML and XML and who-knows-what-else-ML. Fortunately, VFP has been tuned to make construction of large strings extremely efficient.
However, there’s one important trick you need to know. Building a string from left to right is fast; building it from right to left isn’t. What does that mean? It means that a series of assignments in the form:
cString = cString + cMoreStuff
is much faster than:
cString = cMoreStuff + cString
regardless of the actual contents of the variables.
How much faster? That does depend on the contents of the variables or, more specifically, the length of the strings involved. As the string gets larger, the first version gets better and better. To build a string of 30,000 characters, adding three at a time, the first version runs about two orders of magnitude faster than the second.
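A quick sketch of the fast, left-to-right pattern:

cString = ""
FOR nI = 1 TO 10000
   cString = cString + "<x>"  && appending on the right: VFP optimizes this
ENDFOR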
VFP 6 also introduced a new way to create objects. The NewObject() function lets you instantiate objects without worrying about whether you've pointed to the class library ahead of time; instead, you just include the library name in the call. CREATEOBJECT(), of course, needs the library in the current list, with either a SET CLASSLIB or SET PROCEDURE ahead of time.
So which way is faster? As usual, the answer is "it depends." With VCX-based classes, if you can issue SET CLASSLIB just once and then instantiate classes from that library repeatedly, CREATEOBJECT() is the way to go. It's anywhere from four to ten times faster than calling NewObject() with the class library. On the other hand, if you need to load the library each time, the SET CLASSLIB/CREATEOBJECT() pair is in the same ballpark as NewObject().
How about for classes written in code? In that case, issuing a single SET PROCEDURE and calling CREATEOBJECT() repeatedly is five to ten times faster than either NewObject() or the SET PROCEDURE/CREATEOBJECT() pair, which are pretty similar.
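In code, the winning pattern looks something like this (library and class names are ours):

SET CLASSLIB TO MyControls ADDITIVE     && pay the library-open cost once
FOR nI = 1 TO 1000
   oThing = CREATEOBJECT("MyButton")    && fast from here on
ENDFOR

* For a class you need only once in a while:
oRare = NEWOBJECT("MyWizard", "Wizards.vcx")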
In the VFP 6 version of this book, we reported that instantiating a coded class was a little faster than instantiating a VCX-based class, but not enough faster to wipe out the benefits of developing classes visually. In VFP 7, we see varying results: Sometimes the coded class instantiates faster, while at other times, the VCX-based class is faster. Bottom line: We’ll stand by our advice to develop visual objects visually.
We're not really surprised that NewObject() is generally slower than CREATEOBJECT(). It's doing a lot of work behind the scenes. Here's the sequence: Save the names of all open class libraries, then close them. Next, open the specified class library, instantiate the object, and close the class library. Finally, reopen all the formerly open class libraries. Whew, that's a lot of files to find and mess with. Of course, we don't understand why VFP doesn't check whether the specified library is in the current list before going to all that trouble; seems to us it could speed up NewObject() considerably, in most cases.
We tested and found no performance penalty for having a lot of class libraries open, no matter where in the list the class you're instantiating is found. So the rule here is to think about how you're going to do things before you write the code and, if possible, just keep open the class libraries you use a lot. Then use NewObject() for the one-shots, the classes from libraries you need only once in a while.
Can you have too much memory? It turns out that, in VFP, the answer is "yes." When you start VFP, it figures out how much memory it ought to be able to use, if it needs it. The number is generally about half as much as the machine actually has. Often, the amount that VFP picks is too much.
How can you have too much memory? Like this, according to our buddy Mac Rubel, who knows more about this topic than anyone else—more even, we suspect, than the folks who wrote VFP. However much memory VFP grabs, it assumes it has that much physical memory to work with. But, because it takes so much memory, it often doesn’t—some of the memory it’s working with is really disk space pretending to be memory, and that’s slow. By decreasing the amount of memory VFP thinks it has available, you ensure that it uses only physical memory. VFP knows what to do when it needs more memory than it has available, and it’s good at that. The last thing you want happening is the operating system swapping virtual (disk) memory for real memory while FoxPro thinks it is using RAM. So, as long as you restrict it to using physical memory, things are fast, fast, fast.
Okay, so how do you do that? Use the SYS(3050) function. SYS(3050,1) controls foreground memory, the memory VFP has available when it's in charge. SYS(3050,2) is for background memory: how much memory FoxPro should have when you're off doing something else. In either case, you pass it a number and it rounds that down to a number it likes (multiples of 256), and that's how much memory it uses. It even tells you how much it really took.
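For example (the sizes here are just a starting point for your own trial and error):

? SYS(3050, 1, 64 * 1024 * 1024)  && cap foreground memory at 64 MB; returns what it really took
? SYS(3050, 2, 16 * 1024 * 1024)  && background memory, for when VFP isn't in charge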
We’ve been amazed how much of a difference this setting can make. Unfortunately, getting it just right is a matter of trial and error, so try a bunch of different settings until you find the one that seems suited to the way you work.
All of the tips we’ve given you here should speed up your code, but your application on your network with your data is the true test (and “your” refers to the final production environment, not necessarily your development environment). Differences in network throughput, the architecture of your system, the design of your tables, your choice of indexes, the phase of the moon, what’s on TV that night, and so forth, all make significant differences in the performance you see. Our advice is always to examine and benchmark how a particular change affects your system. Keep in mind that a single test isn’t conclusive unless it can be repeated, and you need to repeat tests with caution because FoxPro and the operating system and the network and even your disk controller might be caching information. Also, don’t forget to shut down background processes like your virus scanner and your e-mail client while doing performance testing.
Finally, now that we've given you all this advice, keep in mind that most of the time, you don't need to worry about a few milliseconds here or there. Don't spend a lot of energy optimizing until you know there's a problem. Of course, when there's a clear choice and it's also good practice (like using SCAN instead of DO WHILE to process a table), write your applications using the better choice. But your first task is to get the application working and producing the right results.
If there’s a speed problem at that point, use the Coverage Profiler to help you figure out where the bottleneck is. Then, follow longtime Fox guru George Goley’s advice and “take out the slow parts.” That’s where the information here should be most useful.