Introduction to SASPy for Using SAS in Python

 
Use SAS in Python: SASPy
 
Introduction (2)
 
WHARTON RESEARCH DATA SERVICES

Eunji Oh, PhD
June 2020

Agenda
SASPY BASIC FUNCTIONS / JUPYTER MAGIC
 
SASPy - Basic Functions
 
Start SAS session
Session configuration
Submit SAS jobs in a Python session
Print SAS log
See ODS tables/graphics in a Python IDE
Jupyter Magic
Convert DataFrame to SAS dataset and vice versa
sd2df(method='CSV') or sd2df_CSV() to reduce memory consumption and speed up the process
Transfer a Python variable to a SAS macro variable
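
The last item, passing a Python value into a SAS macro variable, might look like the sketch below. It assumes the session object has a symput(name, value) method, as a SASPy SASsession does. The helper name push_macro_vars and the example variable names are made up for illustration.

```python
def push_macro_vars(session, params):
    """Copy each key/value in `params` to a SAS macro variable of the same name.

    `session` is assumed to be a saspy.SASsession; any object with a
    symput(name, value) method works. Returns the names that were set.
    """
    names = []
    for name, value in params.items():
        # Values are sent as text, roughly like: %let name = value;
        session.symput(name, str(value))
        names.append(name)
    return names


# Typical use with a live session (needs a working SAS connection):
#   sas = saspy.SASsession(cfgname='default')
#   push_macro_vars(sas, {'start_year': 2015, 'ticker': 'IBM'})
#   sas.submit('%put &start_year &ticker;')
```

Reading a macro variable back into Python is the mirror operation, session.symget(name).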
 
 
Start SAS session

    import saspy
    # Connect to the default session; configuration comes from sascfg_personal.py
    sas = saspy.SASsession(cfgname='default')

    sas.assigned_librefs()                # shows assigned libraries
    sas.datasets(libref='library_name')   # equivalent to PROC DATASETS

Set the session name (here, sas) and establish a connection.
Set session configuration options. Leave it blank to see all available options, or set the cfgname parameter for other configurations.
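
For reference, a configuration file for a local Linux SAS (the STDIO access method) can be quite small. The sketch below shows what a sascfg_personal.py with a single default entry could contain. The saspath value is a placeholder for wherever SAS is installed on your machine, not an actual WRDS path.

```python
# sascfg_personal.py -- minimal sketch for a local Linux SAS (STDIO access method)

# Names of the configurations that SASsession(cfgname=...) may refer to.
SAS_config_names = ['default']

# One dictionary per configuration name.
default = {
    'saspath': '/path/to/SASHome/SASFoundation/9.4/bin/sas',  # placeholder path
    'encoding': 'latin1',  # Python codec matching the SAS session encoding
}
```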
 
SAS session configuration check
 
Type the session name at the interpreter prompt.
Example: WRDS Jupyter Lab default configuration
- STDIO (connect to local SAS in Linux)
- SAS session encoding: latin1
- Python encoding: latin_1
- Output results format: Pandas DataFrame
- Process PID number
 
 
SASPy SAS session default configuration
 
Windows PC
- SAS Windows, wlatin-1 as the default encoding type
Linux Desktop
- SAS Linux, latin-1 as the default encoding type
WRDS server (WRDS cloud, WRDS Jupyter hub, etc.)
- SAS Linux, latin-1 as the default encoding type
- SAS u8, utf-8 is also available
 
 
Accessing existing SAS library / datasets
    sas.saslib(libref='home', path='~')                      # Set libref to home
    hr = sas.sasdata(table='tablename', libref='mylibref')
    hr.columnInfo()                                          # Column-level metadata
    hr.head()                                                # First 5 observations shown
    hr.sort(by='variable', out='home.hr')                    # SAS PROC SORT equivalent
    hr_df = hr.to_df()                                       # Convert to Pandas DataFrame

Other methods: add_vars, append, describe, where, hist, etc. (https://sassoftware.github.io/saspy/api.html#sas-data-object)
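
As one illustration of the other methods, where can narrow a table inside SAS before anything is copied to pandas. This is a minimal sketch that assumes a SASPy session object. The helper name filtered_frame and the example condition are hypothetical.

```python
def filtered_frame(session, table, libref, condition):
    """Apply a SAS WHERE condition, then convert only the matching rows.

    `session` is assumed to be a saspy.SASsession. Filtering happens on the
    SAS side, so pandas only receives the subset.
    """
    data = session.sasdata(table=table, libref=libref)
    subset = data.where(condition)   # SAS data object with the WHERE clause applied
    return subset.to_df()            # convert the subset to a pandas DataFrame


# Typical use with a live session (needs a working SAS connection):
#   hr_df = filtered_frame(sas, 'tablename', 'mylibref', 'salary > 50000')
```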
 
Submit a SAS job
    from IPython.display import HTML
    results = sas.submit("your SAS code should be here")
    HTML(results['LST'])     # Print result (ODS output)
    print(results['LOG'])    # Print SAS log

Here sas is the connected SAS session name.
The "submit" method runs a SAS job in the background (SAS batch mode, no real-time logs).
Other methods:
submitLOG: submit a job and print LOG
submitLST: submit a job and show results (LST)
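
Because submit returns the log only after the job has finished, it can help to scan that log before using the results. The sketch below wraps submit in a hypothetical helper, run_sas. Treating log lines that begin with "ERROR" as failures is a simple heuristic assumed here, not something SASPy does for you.

```python
def run_sas(session, code):
    """Submit SAS code and return (log, listing); raise if the log shows an error.

    `session` is assumed to be a saspy.SASsession, whose submit() returns a
    dict with the keys 'LOG' and 'LST'.
    """
    results = session.submit(code)
    log, listing = results['LOG'], results['LST']
    # Heuristic: collect log lines that start with "ERROR".
    errors = [line for line in log.splitlines() if line.startswith('ERROR')]
    if errors:
        raise RuntimeError('SAS job failed:\n' + '\n'.join(errors))
    return log, listing


# Typical use with a live session (needs a working SAS connection):
#   log, listing = run_sas(sas, 'proc means data=sashelp.class; run;')
```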
 
Converting Data Pandas ↔ SAS
 
For small datasets (via memory):
sd2df: SAS dataset to Pandas DataFrame
df2sd: Pandas DataFrame to SAS dataset

For large datasets:
sd2df_CSV: SAS to Pandas via CSV file (via PROC EXPORT and read_csv())
sd2df(method="DISK"): SAS to Pandas via writing file
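
The choice between the two routes can be wrapped in a small helper. This is only a sketch: the name to_frame and the large flag are made up, and it assumes a SASPy session whose sd2df accepts the method argument shown on this slide.

```python
def to_frame(session, table, libref, large=False):
    """Convert a SAS dataset to a pandas DataFrame.

    `session` is assumed to be a saspy.SASsession. Small tables go through
    memory; for large ones the CSV route is requested instead.
    """
    method = 'CSV' if large else 'MEMORY'
    return session.sd2df(table=table, libref=libref, method=method)


# Typical use with a live session (needs a working SAS connection):
#   small_df = to_frame(sas, 'tablename', 'mylibref')
#   big_df = to_frame(sas, 'tablename', 'mylibref', large=True)
```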
 
Converting Data: SAS ↔ Pandas
 
Avoid converting a huge SAS dataset into a Pandas DataFrame. Instead:
- Run your SAS job behind the scenes by using the 'submit' API
- Convert only a reasonably sized SAS dataset into a Pandas DataFrame
- When the SAS dataset is huge, the 'sd2df_CSV' method is recommended
Be aware of encoding types:
- Generating a latin1 dataset from a utf8 source can raise a transcoding error
- Currently, most WRDS SAS datasets have latin1 encoding
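
The transcoding problem is not specific to SAS. The pure-Python sketch below is an analogy rather than a reproduction of SAS's own transcoding step: it shows that a character that exists in UTF-8 may have no Latin-1 representation. The sample string is made up.

```python
text = 'Price: 100€'   # the euro sign exists in UTF-8 but not in Latin-1

# Encoding to UTF-8 works for any Python string.
utf8_bytes = text.encode('utf_8')

# Encoding to Latin-1 fails because '€' has no Latin-1 code point.
try:
    text.encode('latin_1')
    transcoding_ok = True
except UnicodeEncodeError:
    transcoding_ok = False

print(transcoding_ok)   # False
```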
 
 
Converting Data: Encoding issue
 
How to resolve?

    data mylib.mydata (encoding=latin1);

Or

    libname latlib '~/mysasdata' outencoding=latin1 inencoding=latin1;

More info: https://blogs.sas.com/content/sgf/2014/09/26/encoding-helping-sas-speak-your-language/
 
 
Jupyter Magic for SAS
 
[Screenshot: a Jupyter notebook cell run with the SAS magic, with callouts for the Jupyter Magic, the SAS session name, the SAS log, and the ODS outputs]
 
Jupyter Magic for SAS
 
1. Import SASPy and establish a SAS connection
2. Start the code with %%SAS my_session
3. Write SAS code (not Python) in the same cell
4. The Jupyter Magic executes the code in the cell and returns the SAS log and outputs
 
 
End session
 
    my_session.endsas()

Works universally on any platform
Terminates the SAS subprocess
Make sure to always run this method at the end of your SAS session
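
One way to make sure endsas() is never skipped, even when a cell raises an exception, is to wrap the session in a context manager. The sketch below is one possible pattern, not part of SASPy itself. The helper name sas_session and its factory argument are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def sas_session(factory):
    """Yield a session created by `factory()` and always call endsas() on exit."""
    session = factory()
    try:
        yield session
    finally:
        # Terminate the SAS subprocess even if the body raised an exception.
        session.endsas()


# Typical use (needs saspy and a working SAS connection):
#   import saspy
#   with sas_session(lambda: saspy.SASsession(cfgname='default')) as sas:
#       print(sas.submit('proc options; run;')['LOG'])
```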
 
 
Summary
 
SASPy basic functions for data access and SAS job submission
Jupyter Magic provides a SAS-friendly development environment

Next video: An example SASPy application with WRDS data
 
Explore the functionalities of SASPy for integrating SAS with Python, enabling tasks like starting SAS sessions, submitting jobs, converting data, and transferring variables. Learn about basic functions, Jupyter magic, session configuration, accessing SAS libraries/datasets, and more through this comprehensive guide with practical examples.

  • SASPy
  • SAS in Python
  • Data Analysis
  • Integration
  • Programming

Uploaded on Jul 10, 2024 | 2 Views



