
Best Practices for Creating Ad Hoc Queries


Best practices:

  • Start with known, vetted logic (like one of our template queries!)
  • Start small and simple, and build gradually. If your query could return a large result set, it’s often a good idea to first try the logic on a small sample to make sure it’s working properly. Sometimes it helps to apply an arbitrary limit, such as filtering only for EMPLIDs starting with ‘059…’ or only a particular UG cohort.
  • Vet your data; if doing something complex, vet each stage (read more about vetting your results below)
  • (Almost) Always include time in your filters
  • Beware of too many filters
  • (Usually) Include the fields you filter on in the SELECT statement; this helps with documenting your results.
  • Ask us for help!!  We’re happy to consult with you. Email: siris-support@lists.stanford.edu
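The “start small” advice above can be sketched in a runnable form. This is a minimal illustration using an in-memory SQLite database; the table and column names (student, emplid, cohort, major) are made up for the example and are not the actual SIRIS schema.

```python
import sqlite3

# Hypothetical sample data; real SIRIS tables and columns will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (emplid TEXT, cohort TEXT, major TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("05912345", "2024-UG", "HISTORY"),
     ("05954321", "2024-UG", "PHYSICS"),
     ("06011111", "2023-UG", "HISTORY")],
)

# Try the logic on a small, arbitrary sample first:
# here, only EMPLIDs starting with '059'.
sample = conn.execute(
    "SELECT emplid, major FROM student WHERE emplid LIKE '059%'"
).fetchall()
print(sample)  # two rows, not the whole table
```

Once the logic checks out on the sample, the arbitrary filter can be removed and the query run against the full population.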

What can go wrong

  • No results
  • Too many results – usually from not filtering enough or not filtering properly
  • Misleading results
  • Inadvertent “drilling” can add filters to your query without your realizing it

Some of the possible causes

  • Forgetting a key filter
  • Including too many filters
  • Selecting a field which changes the grain without your realizing it
  • Counting the wrong thing (e.g., counting plans when you meant to count students)

Vetting your results:

Check totals against an alternate verified source, if possible, such as

  • Student Analytics Dashboards
  • Published statistics
  • Your own reports/queries
  • PeopleSoft – either the application or via a query to the database

No external source to check against? Try these validation strategies:

  • Run the same analysis using different query approaches
  • Develop 'if-then' hypotheses based on your results (e.g., 'If 200 students declared this major, then course enrollment in required classes should be approximately Y')
  • Test these hypotheses against related data you can verify
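The “different query approaches” strategy above can be sketched as computing the same total two different ways and checking that they agree. The table and column names here are hypothetical, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical enrollment data for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (emplid TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("A1", "HIST101"), ("A1", "PHYS50"), ("B2", "HIST101")],
)

# Approach 1: count distinct students directly.
n1 = conn.execute(
    "SELECT COUNT(DISTINCT emplid) FROM enrollment"
).fetchone()[0]

# Approach 2: group by student, then count the groups.
n2 = conn.execute(
    "SELECT COUNT(*) FROM (SELECT emplid FROM enrollment GROUP BY emplid)"
).fetchone()[0]

# If the two approaches disagree, the query logic needs a second look.
print(n1, n2)  # both should report 2 distinct students
```

Agreement between independent approaches doesn’t prove the result is right, but a disagreement is a reliable sign that something in the logic is off.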

Here are some other general strategies:

  1. Checking counts—If you know how many records make up a group, this is a good quick check. If you don’t know the counts for a large group, then test a subset (e.g., enrollment in only one course instead of all courses in a subject).
  2. Spot checking—Choose a couple of values in your normal range, and see if the data for those few are correct.
  3. Checking outliers—Know a student who dropped a class and then re-enrolled? That might be a good person to check to see if major information is mapping correctly. Data that look weird are often good test cases.
  4. Checking for recent changes—If you know a data value has changed recently, that’s often a good way to check that the data are being brought in correctly.
  5. Finally, does it make sense? (the “ballpark” audit)—Data that have never been available before can sometimes be hard to assess, but do they fall in line with what you know? Ballpark estimates are useful sanity checks.
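The count-checking strategy above pairs naturally with the “counting the wrong thing” pitfall: a student with two plans inflates a naive row count. This sketch uses an in-memory SQLite database with made-up table and column names (plan, emplid, acad_plan), not the real schema.

```python
import sqlite3

# Hypothetical academic-plan data: student A1 is a double major.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plan (emplid TEXT, acad_plan TEXT)")
conn.executemany(
    "INSERT INTO plan VALUES (?, ?)",
    [("A1", "HISTORY-BA"), ("A1", "PHYSICS-BS"),
     ("B2", "HISTORY-BA")],
)

# Naive row count: this counts plans, not students.
rows = conn.execute("SELECT COUNT(*) FROM plan").fetchone()[0]

# Distinct count: this counts students.
students = conn.execute(
    "SELECT COUNT(DISTINCT emplid) FROM plan"
).fetchone()[0]

print(rows, students)  # 3 plans, but only 2 students
```

If you know the group should contain 2 students, the naive count of 3 is exactly the kind of discrepancy these checks are designed to surface.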

Last but not least, consult with IRDS and other campus colleagues to check whether your query logic and results are correct.