OraQA

Oracle Question and Answer

  • Do you have a solution to a problem? Do you have an unanswered question? Login and share it with the Oracle community. More...

Oracle News


Entries RSS feed

Comments RSS feed

How to generate a histogram using SQL

March 6th, 2006 By Eddie Awad

To generate an output like the following, where each employee in a department is represented by a “+”:

DEPARTMENT_NAME                EMPLOYEES
------------------------------ ----------------------------------------------
Administration                 +
Accounting                     ++
Purchasing                     ++++++
Human Resources                +
IT                             +++++
Public Relations               +
Executive                      +++
Shipping                       +++++++++++++++++++++++++++++++++++++++++++++
Sales                          ++++++++++++++++++++++++++++++++++
Finance                        ++++++
Marketing                      ++

Use the following query:

select d.department_name,
lpad('+',count(*),'+') as employees
from employees e, departments d
where e.department_id = d.department_id
group by d.department_name

Note that you can also use rpad instead of lpad.

Read more about generating strings here and here.

4 Responses to “How to generate a histogram using SQL”

  1. Karl Reitschuster Says:

    Hi Eddie,
    your example shows very good how a Frequency Histogram works. Do you have an example(sql) / idea how to generate a height based histogram with your pseudo graphical output?

    from the Oracle Server Performance Guide :

    Frequency Histograms
    In a frequency histogram, each value of the column corresponds to a single bucket of
    the histogram. Each bucket contains the number of occurrences of that single value.
    Frequency histograms are automatically created instead of height-balanced histograms
    when the number of distinct values is less than or equal to the number of histogram
    buckets specified. Frequency histograms can be viewed using the *TAB_HISTOGRAMS
    tables, as shown in Example 14–2.

    Height-Balanced Histograms
    In a height-balanced histogram, the column values are divided into bands so that each
    band contains approximately the same number of rows. The useful information that
    the histogram provides is where in the range of values the endpoints fall.

    Greetings
    Karl

  2. Eddie Awad Says:

    Well, an example of the data behind a height-balanced histogram (from the HR schema in my XE database):

    SELECT endpoint_number, endpoint_value
      FROM ALL_HISTOGRAMS
     WHERE table_name = 'EMPLOYEES' and column_name = 'SALARY'
      ORDER BY endpoint_number
    
    ENDPOINT_NUMBER        ENDPOINT_VALUE
    ---------------------- ----------------------
    0                      2100
    1                      2600
    2                      2900
    3                      3200
    4                      4200
    5                      6400
    6                      7400
    7                      8300
    8                      9500
    9                      11500
    10                     24000                  
    
    11 rows selected
    

    How do you imagine the result set above be represented graphically?

  3. Karl Reitschuster Says:

    Sorry for confusion. I thought about an example out of the HR Data to generate a Height based Histogram. For example your Frequncy histogram has 11 Buckets - the departments. If there would be thousands of departments you would need to aggregate that with fewer buckets. I will try an example out of the HR-Schema with the same data when i have time to do this.

    Background is that the understanding of these tow historgram type would increase with these examples.

    Greetings
    Karl

  4. gamyers Says:

    You can derive a height based histogram easily enough with analytics. However I don’t see a useful way of displaying that graphically.

    select bucket, min(object_type), max(object_type), count(*),
    count(distinct object_type)
    from  (select object_type, object_name,
    ntile(10) over (order by object_type, object_name) bucket
    from all_objects where owner = 'SYSTEM')
    group by bucket;
    

Leave a Reply

You must be logged in to post a comment.

RSS feed for comments on this question