Monday, August 20, 2012

Counting lines in text files

One of the best ways to enhance error checking in the scripts you write as a DBA is to search for various error messages and codes in log files.  A related task that often comes up during error checking is counting the number of lines in a text file.  The method differs depending on whether you are using Unix or Windows.

On Windows, the best way we've found to count the number of lines in a text file is to run this from the command prompt:

findstr /R /N "^" {filename.txt} | find /C ":"

(Hint:  The ^ character represents the start of a line, so findstr /R /N "^" matches and numbers every line in the file; find /C ":" then counts the numbered lines by their colons.)

On Unix, we use this method:

wc -l {filename.txt}

or

cat {filename.txt} | wc -l

(Note that wc -l counts newline characters, so a final line that lacks a trailing newline will not be included in the count.)

Saturday, August 18, 2012

Wrapping up a busy week!

What a busy week we've had here at DBConnect Solutions!  News, news, and more news to share on some recent happenings.

First off, we are proud to announce the official release of the first offering in our NightOwl suite of products.  The NightOwl™ Health Check for Oracle JDE is an automated report that compares your JDE Oracle database against our set of Best Practices. The health check reviews security, architecture & topology, backup & recovery, and JDE-specific areas, and produces web-based and PDF reports for your review.  Our official press release can be found on our website in our resources section here.  We've received an exceptional initial response and were thrilled to see our release was picked up by the likes of Yahoo Finance, CNBC, and Market Watch.

We were also happy to be involved in the release of Wave Systems' new Wave Cloud product.  Wave Cloud is a cloud-based service for enterprise-wide management of self-encrypting drives (SEDs).  DBConnect Solutions was selected to take part in the beta testing program during the implementation of our security policies to ensure our client data is protected.  You can find out more about Wave Cloud by reading the official press release here.

Finally, we are excited to announce that DBConnect Solutions will be hosting a booth at the Oracle OpenWorld convention this year from Monday, October 1st through Wednesday, October 3rd.  Our display will be located in the JD Edwards Pavilion in the 3rd-floor lobby of the InterContinental Hotel, which is right next to the Moscone West Exhibition Hall.  Stop by to learn more about our offerings, including our NightOwl™ products, and receive a free gift.  You can contact us at info@dbconnect.com to learn about your chance to receive a free Oracle OpenWorld Discover pass to gain access to exhibits, keynotes, and more.  Stay tuned for more info as we get closer to the conference.

These are exciting times at DBConnect and we're just getting started!  Stop back here often to hear our latest news or sign up on our website to receive updates via email.

Friday, August 17, 2012

When good JDE queries go bad

It's 4 a.m. and your cell phone is ringing again.  You groan because you already know why.  Your JDE job queue is backed up again, and batch jobs are running forever.

You just don't understand it.  These little jobs that normally run in under a minute sometimes take as long as an hour to complete.  And worse yet, your users in Europe have started complaining about system performance when running web applications at the same time your job queue gets backed up.  It just doesn't make any sense.  Or does it?

Unfortunately, the above scenario is fairly common when you use an Oracle database with JDE.

Here's an explanation of what can happen to throw off the Oracle optimizer and cause SQL queries to start behaving very badly.  

By default, Oracle statistics are updated in the nightly maintenance window that begins at 10 p.m.  Any table that has had enough changes since its statistics were last collected will get new stats.  Also by default, Oracle uses an internal algorithm to decide when to invalidate cached SQL cursors.  (This setting can be changed to force an immediate invalidation, if you desire -- DBConnect typically does force immediate invalidation, by the way.)
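If you want to change that invalidation behavior, on 11g one way is through DBMS_STATS preferences.  This is a sketch, not a recommendation for every environment (and on 10g the equivalent call is DBMS_STATS.SET_PARAM):

-- Check how gathered stats currently invalidate cursors
-- (DBMS_STATS.AUTO_INVALIDATE is the default):
SELECT DBMS_STATS.GET_PREFS('NO_INVALIDATE') FROM DUAL;

-- Force immediate cursor invalidation whenever new stats are gathered:
EXEC DBMS_STATS.SET_GLOBAL_PREFS('NO_INVALIDATE', 'FALSE');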

Once new statistics are available for a table, and when any cached cursors are invalidated, new SQL plans will be calculated when an SQL statement next references the given table.  This is a key moment in time, because whatever plan the Oracle engine calculates upon the very first execution of the SQL statement will then be used for all subsequent executions (as long as the SQL statement remains cached).

If, therefore, a BAD execution plan happens to be calculated, the performance will obviously be bad.

But how can this happen?

This can happen because of something called bind peeking.  Bind peeking occurs when Oracle "peeks" into the actual values used within an SQL statement's bind variables, and makes a decision on an execution plan based on these values.  That's usually a very good thing because it means that Oracle is using the ACTUAL values used in the query to provide the best possible result.

But what if the data in the underlying tables is skewed?

If the underlying data is skewed (a very common issue in JDE databases - think of your companies, divisions, doc types, etc.), then Oracle might make a great choice when an SQL statement is first executed, but the path used for the next execution might be a terrible choice.

Let's look at this example on your Sales Order Detail File, F4211:

Let's say you have approximately 1 million records in this table, and the breakdown of your doc types (SDDCTO) is as follows:

S1 - 10 records
S2 - 20,000 records
S3 - 20,000 records
S4 - 10,000 records
S5 - 10,000 records
S6 - 10,000 records
S7 - 20,000 records
S8 - 10,000 records
S9 - 900,000 records
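You can measure this kind of skew in your own table with a simple aggregate (assuming the standard PRODDTA schema used throughout this example):

-- Row counts per document type -- the distribution the
-- optimizer has to cope with:
SELECT sddcto, COUNT(*)
  FROM proddta.f4211
 GROUP BY sddcto
 ORDER BY COUNT(*) DESC;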

Now, let's say stats are updated on F4211 at 10:00 p.m., and then a UBE runs at, say, 11 p.m., executing an SQL statement that looks like this:

SELECT * FROM PRODDTA.F4211 WHERE SDDCTO = :1

If the very first execution of the SQL statement specifies a value of S1 for doc type, then it is very likely an index will be used because of a composite index that includes the SDDCTO field.  This is because there are only 10 records with a value of S1, and Oracle knows about this because stats were just collected.

The SQL statement will run very fast.

Now, let's say the UBE continues and now specifies a value of S9 for doc type.  The execution plan is already cached, so it will use the index on SDDCTO.  This means it will need to do an index range scan to locate 900,000 entries, and then take the resulting 900,000 ROWIDs, and go retrieve those records from the table!  This is extremely inefficient, and it would be far more efficient to do a full scan of the F4211 table.
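If you want to confirm that every execution really is sharing one cached plan, you can look it up in V$SQL.  This is a sketch -- it requires SELECT access to the V$ views, and the LIKE pattern is illustrative:

-- Find the cached child cursor for the statement:
SELECT sql_id, child_number, plan_hash_value, executions
  FROM v$sql
 WHERE sql_text LIKE 'SELECT * FROM PRODDTA.F4211 WHERE SDDCTO%';

-- Display the single plan all of those executions are sharing:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id', &child_number));

The same plan_hash_value will show up no matter which doc type the bind variable carried on any given execution.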

So what can you do at 4 a.m.?

Force dependent SQL statements to be reparsed!

A very easy way to do this is to issue a null comment on the table.  Since this is a DDL statement, it will force all dependent cursors to be invalidated:

COMMENT ON TABLE PRODDTA.F4211 IS '';

The down side is that the currently-running query will have to complete before Oracle will start using a new plan.  Once the statement has completed, though, Oracle will reparse the SQL statement upon its next execution, and you might just find that the batch jobs take right off, and you can go right back to sleep.
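An alternative, if you'd rather not issue DDL against a production table, is to purge just the one offending cursor from the shared pool.  This is a sketch; DBMS_SHARED_POOL.PURGE is fully supported from 11g (and backported to 10.2.0.4, where it requires an extra event to be set):

-- Look up the cursor's address and hash value:
SELECT address, hash_value
  FROM v$sqlarea
 WHERE sql_text LIKE 'SELECT * FROM PRODDTA.F4211 WHERE SDDCTO%';

-- Purge that single cursor ('C' means cursor):
EXEC DBMS_SHARED_POOL.PURGE('&address, &hash_value', 'C');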

Disclaimer:  We don't consider this a FIX for this type of issue so much as a troubleshooting tactic for when you get into a critical situation.  There are strategies for preventing issues like this from ever presenting themselves, and that's what we recommend our customers pursue when these types of issues occur.