Saving You 15 Minutes: CyberArk ‘Pass Through’ SQL*Plus Connections, JavaScript, Listener Settings, Services, TNS Errors

I’m currently doing some work around CyberArk, using it for privileged password management across the Oracle estate.  One of the things CyberArk can be configured to do out of the box is SQL*Plus ‘pass through’ connections.  What this means is that you can log in to CyberArk, select a database and click ‘connect’, and a SQL*Plus session is launched without the password ever being released to you.

The launching of SQL*Plus is done using a combination of things: a JavaScript connection component, the CyberArk database settings and an Oracle Instant Client.  CyberArk stores the following information: the SQL*Plus (Instant Client) location, Address, Database and Port.  The Oracle Instant Client is installed locally on your desktop or laptop.  The JavaScript takes the information stored in CyberArk and launches the SQL*Plus session.

When we initially configured this and launched SQL*Plus we got a TNS error, ORA-12504, which says the listener was not given the SERVICE_NAME as part of the connect string.  This was odd, as the Database field was definitely configured and saved.

A bit of head scratching led me to this listener.ora parameter:

DEFAULT_SERVICE_LISTENER=(DB1)

This parameter allows a TNS connection to specify just a host and nothing else.  Why you would want to do such a thing is a bit of a mystery, but it allowed the SQL*Plus pass through connection to work because the service name was supplied by the listener rather than the client.  Have a look at MOS note 556996.1 for more details.
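
As a quick illustration (hypothetical account), with the parameter above in place a connect string naming only the host succeeds, because the listener supplies the DB1 service itself; without it, the same attempt is what produces ORA-12504:

sqlplus scott/tiger@ldnserver01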

Getting this to work at all was a result, but setting this parameter across the whole estate as a workaround wasn’t acceptable.  So I decided to have a look at the JavaScript.  Like most DBAs, I have probably debugged enough code (shell scripts, Perl scripts, pizza menus) to at least have a crack at JavaScript.

Here is an excerpt from the script.

var sCommand = "\"" + SQLPlusPath + "\" " + username + "/" + password + "@" + address;

So it seems that the default connect string for CyberArk SQL*Plus pass through connections uses only the Address field from the CyberArk application; the Database and Port fields play no part in forming the connect string.  So in CyberArk you would think you would specify this:

Address:       ldnserver01
Port:          1521
Database:      DB1

What you actually need to specify is this:

Address:       ldnserver01:1521/DB1
Port:          1521 (Fill it out for completeness but it's not used)
Database:      DB1 (Fill it out for completeness but it's not used)
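
To make the mechanics concrete, with the fields filled in as above the script excerpt resolves to a command along these lines (the Instant Client path and account here are hypothetical); the Address field is simply a standard EZCONNECT string:

"C:\instantclient_11_2\sqlplus.exe" app_user/secret@ldnserver01:1521/DB1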

As Gordon Ramsay might say: CyberArk, SQL*Plus Pass Through…..DONE

TweetMe: pbedba

 

Saving You 15 Minutes: MGMT$JOB_STEP_HISTORY and Security

MGMT$JOB_STEP_HISTORY is a very handy view for determining what steps occurred for a particular job, and you can use it for troubleshooting job failures.  More importantly, the step output is clearly readable from a tool like SQL Developer (it’s a CLOB), so if you are using OS or database jobs in OEM that contain usernames and passwords, either on specific targets or across the estate, be mindful that these details will remain in this view for some time.  I have job history going back to 7th August last year, and I will keep an eye on this date to see whether it is purged on a daily or weekly basis.
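
As a minimal sketch of how you might check this (column names are my assumption from the SYSMAN repository, so verify them against your OEM version), something along these lines shows how far back the retained step output goes and lets you eyeball it for embedded credentials:

-- list retained job step history, oldest first; OUTPUT is the readable CLOB
select job_name, step_name, start_time, output
from   sysman.mgmt$job_step_history
order  by start_time;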

TweetMe: pbedba


Saving You 15 Minutes: SVC SSDs Flash vs Exadata: Gathering The Stats Part 3

Now that we have the statistics pre and post the move to SSDs, we can compare them with the same queries on Exadata.  I have taken two queries and looked at the execution times.

[Image: EXA_SSDspng4 (query execution times: pre SSD, post SSD, Exadata)]

So the results show that there is an improvement in query performance when you move to SSDs; however, what this really demonstrates is Exadata query offloading.  Looking at the detail for these queries in DBA_HIST_SQLSTAT, I can see both of them were reading around 150GB of data to obtain their results.  The same queries on Exadata achieved around 98% to 99% offload, meaning they returned only around 2 to 3GB to the compute node for processing.  The benefit of query offloading is quite clear.

Looking at the statistics more generally, the move to SSDs provided an overall performance gain of about one fifth, which isn’t bad for a seamless bit of storage black magic, i.e. no outage and a simple implementation.  However, what is also clear is that the underlying I/O sub-system is still having to cart around huge amounts of data.  No matter how fast your disks are, there is only a certain amount of benefit to be had from SSDs, so it is worth quantifying the I/O profile of your system (slow I/O, lots of I/O, or both!) to understand what SSDs will actually buy you.  The query offloading improved performance around 62 times…enough said!
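
For completeness, a rough sketch of the sort of query used to check the read volumes (assuming the 11.2 AWR column PHYSICAL_READ_BYTES_TOTAL; substitute your own SQL_IDs and snapshot range):

-- total bytes read per SQL_ID, taking the final cumulative figure across the snapshots
select sql_id,
       round(max(physical_read_bytes_total)/1024/1024/1024, 1) gb_read,
       round(max(elapsed_time_total)/1000000, 1) elapsed_secs
from   dba_hist_sqlstat
where  sql_id in ('&sql_id_1', '&sql_id_2')
group  by sql_id;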

Tweet Me: pbedba

 

Saving You 15 Minutes: SVC SSDs Flash vs Exadata: Gathering The Stats Part 2

So although the last query provided the data we required, it was lacking in two areas:

  • It wasn’t possible to compare pre/post times directly, as the results differed depending on which executions we picked to compare, i.e. we needed averages
  • The query returned data that we didn’t require, i.e. queries that ran for over the threshold but were executed only before the move to SSDs or only after, not both.

To resolve these two issues I am going to use some analytic functions.  Again, I would stress that this is SQL written as a one-off to obtain some information; you could probably rewrite it more efficiently, but as a set of building blocks it is probably easier to understand this way.

This is the original query:

select snap_id, 'POST' as when, sql_id,
       round((elapsed_time_total/1000000),2) ETL,
       rows_processed_total,
       rows_processed_total/round((elapsed_time_total/1000000),2) rows_per_sec,
       round((cpu_time_total/1000000),2) cpu_time_total,
       round((iowait_total/1000000),2) iowait_total
from   dba_hist_sqlstat
where  sql_id in
       (
        select sql_id from (
                     select snap_id,
                            sql_id,
                            plan_hash_value,
                            round((elapsed_time_total/1000000),2),
                            rows_processed_total
                     from   dba_hist_sqlstat
                     where  executions_total=1
                     and    snap_id < 59045
                     and    rows_processed_total > 0
                     and    round((elapsed_time_total/1000000),2) > 320
                            )
       )
and    snap_id > 59165
and    rows_processed_total > 0
and    executions_total=1
union all
select snap_id, 'PRE' as when, sql_id,
       round((elapsed_time_total/1000000),2) ETL,
       rows_processed_total,
       rows_processed_total/round((elapsed_time_total/1000000),2) rows_per_sec,
       round((cpu_time_total/1000000),2) cpu_time_total,
       round((iowait_total/1000000),2) iowait_total
from   dba_hist_sqlstat
where  executions_total=1
and    snap_id < 59045
and    rows_processed_total > 0
and    round((elapsed_time_total/1000000),2) > 320
order  by sql_id, snap_id

The way we will build this up to get the data we want is to use a number of analytical wraps.

First Wrap: Getting the Averages

select distinct
       sql_id,
       when,
       avg(ETL)                  over (partition by when, sql_id) as AVG_ELAPSED_TIME,
       avg(rows_processed_total) over (partition by when, sql_id) as AVG_ROWS_PROCESSED,
       avg(rows_per_sec)         over (partition by when, sql_id) as AVG_ROWS_PSEC,
       avg(cpu_time_total)       over (partition by when, sql_id) as AVG_CPU_TIME,
       avg(iowait_total)         over (partition by when, sql_id) as AVG_IO_TIME
from   (original query)

We are using the avg analytic function, partitioned over WHEN and SQL_ID.  To put this in English: for each SQL statement we want the averages for the PRE SSD and POST SSD periods.  The hard coded ‘PRE’ and ‘POST’ columns in the original query are given the alias WHEN, so when this query selects the data it can average the statistics separately for the PRE SSD and POST SSD runs.  The output looks like this:

[Image: EXA_SSDspng1 (first wrap output)]

Second Wrap: Getting only the required data

Filtering out unwanted data after it has already been retrieved is BAD, hence the original comment about a re-write.  However, based on what we have got, I want to get rid of any SQL which wasn’t executed in both the PRE and POST phases.

select sql_id,
       when,
       count(sql_id) over (partition by sql_id) SQL_COUNT,
       AVG_ELAPSED_TIME,
       AVG_ROWS_PROCESSED,
       AVG_ROWS_PSEC,
       AVG_CPU_TIME,
       AVG_IO_TIME
from   (firstwrap(original query))

Based on the first wrap I know that a SQL_ID executed in both the PRE and POST stages will appear twice.  If I put the count, as an analytic function, into the first wrap it would count all the rows for each SQL_ID from the original query, so it wouldn’t give me the correct output.  By placing the count here, after the aggregation, I only ever get a count of ONE or TWO.  I am therefore only interested in queries with a count of TWO, i.e. those that appear in both phases.

[Image: EXA_SSDspng2 (second wrap output)]

Third Wrap: Final query

select * from
(secondwrap(firstwrap(original query)))
where SQL_COUNT=2
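
For reference, a sketch of how the three wraps nest once the placeholders are expanded; the innermost block is the PRE/POST union query shown at the start:

select *
from  (
       select sql_id,
              when,
              count(sql_id) over (partition by sql_id) SQL_COUNT,        -- second wrap
              AVG_ELAPSED_TIME, AVG_ROWS_PROCESSED,
              AVG_ROWS_PSEC, AVG_CPU_TIME, AVG_IO_TIME
       from  (
              select distinct
                     sql_id,
                     when,
                     avg(ETL)                  over (partition by when, sql_id) AVG_ELAPSED_TIME,   -- first wrap
                     avg(rows_processed_total) over (partition by when, sql_id) AVG_ROWS_PROCESSED,
                     avg(rows_per_sec)         over (partition by when, sql_id) AVG_ROWS_PSEC,
                     avg(cpu_time_total)       over (partition by when, sql_id) AVG_CPU_TIME,
                     avg(iowait_total)         over (partition by when, sql_id) AVG_IO_TIME
              from  ( /* original PRE/POST union all query */ )
             )
      )
where SQL_COUNT=2;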

So this is what the final output looks like. It provides exactly the information we require.  The next step is to look at the results!

[Image: EXA_SSDspng3 (final query output)]

Making my old HP OVM Server go a whole lot faster

I’m a massive believer in getting good use out of old hardware which, although perhaps not great for production, can make a fantastic piece of dev/test or training kit, especially when combined with Oracle VM. We had a couple of HP servers sitting around doing very little, but after combining them into one I had a nice little server with lots of cores, lots of RAM and lots of disks. Not cutting edge, but good enough, and more importantly able to run 6 x 2-node RAC clusters for internal training purposes.

Setting up OVM Server doesn’t take long at all, and it recognised the RAID volume I had built using the server’s RAID controller for the OVS repository with no issues. So I went about creating some templates for the training I had set up with some of the team. For this kind of environment, where having lots of space is more important than performance, I opt for RAID 5. However, performance was really, really slow; think about 2 hours to copy the Oracle home from node 1 to node 2. Even RAID 5 shouldn’t be that slow.

So, I found a working HP Array Configuration Utility, version 9.40.12.0, which installed into OVM Server with a simple ‘rpm -ivh’, and fired it up:

=> ctrl all show detail

Smart Array P410 in Slot 1
Bus Interface: PCI
Slot: 1
RAID 6 (ADG) Status: Disabled
Controller Status: OK
Hardware Revision: C
Firmware Version: 2.74
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 secs
Surface Scan Mode: Idle
Queue Depth: Automatic
Monitor and Performance Delay: 60 min
Elevator Sort: Enabled
Degraded Performance Optimization: Disabled
Inconsistency Repair Policy: Disabled
Wait for Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 0 secs
Cache Board Present: True
Cache Status: OK
Cache Ratio: 100% Read / 0% Write
Drive Write Cache: Disabled
Total Cache Size: 256 MB
Total Cache Memory Available: 144 MB
No-Battery Write Cache: Disabled
Battery/Capacitor Count: 0
SATA NCQ Supported: True

So I could see that all of the cache was allocated to reads, Drive Write Cache was disabled and No-Battery Write Cache was also disabled (probably because I had no batteries installed). Basically, nothing there was going to help my write speed in the slightest. As it’s a server for test and training, I want as much performance and as much storage as possible; if I lose power and lose some data then, to be honest, I’m not really bothered. My templates will still be there, static and intact, and if the worst came to the worst, installing OVM Server again is not much of a hardship.

So, full steam ahead with the following commands to turn on all the caching possible.

ctrl slot=1 modify nbwc=enable
ctrl slot=1 modify dwc=enable forced
ctrl slot=1 modify cacheratio=25/75

Which led to the following:

=> ctrl all show detail

Smart Array P410 in Slot 1
Bus Interface: PCI
Slot: 1
RAID 6 (ADG) Status: Disabled
Controller Status: OK
Hardware Revision: C
Firmware Version: 2.74
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 secs
Surface Scan Mode: Idle
Queue Depth: Automatic
Monitor and Performance Delay: 60 min
Elevator Sort: Enabled
Degraded Performance Optimization: Disabled
Inconsistency Repair Policy: Disabled
Wait for Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 0 secs
Cache Board Present: True
Cache Status: OK
Cache Ratio: 25% Read / 75% Write
Drive Write Cache: Enabled
Total Cache Size: 256 MB
Total Cache Memory Available: 144 MB
No-Battery Write Cache: Enabled
Battery/Capacitor Count: 0
SATA NCQ Supported: True

The outcome? Copying the database home from node 1 to node 2 took a total of 9 minutes rather than hours. Quite a massive improvement, and it shows just how important write caching is. So, with a working HP Array Configuration Utility on OVM Server, I now have a super fast training environment built from a couple of old servers and a little bit of time.