Sorting

Since data in CAS can be spread across many machines and may even be redistributed depending on events that occur, the data is not stored in an ordered form. In general, when using statistical actions, this doesn’t make much difference since the CAS actions doing the work will handle the data regardless of the order that it is in. However, when you are bringing a table of data back to the client from CAS using the fetch action, you may want to have it come back in a sorted form.

In [1]: conn = swat.CAS(host, port, username, password)

In [2]: tbl = conn.read_csv('https://raw.githubusercontent.com/'
   ...:                     'sassoftware/sas-viya-programming/master/data/cars.csv')
   ...: 
NOTE: Cloud Analytic Services made the uploaded file available as table TMPWW1WGLQK in caslib CASUSER(kesmit).
NOTE: The table TMPWW1WGLQK has been created in caslib CASUSER(kesmit) from binary data uploaded to Cloud Analytic Services.

In [3]: tbl.fetch(to=5)
Out[3]: 
[Fetch]

 Selected Rows from Table TMPWW1WGLQK
 
     Make           Model   Type Origin   ...   MPG_Highway  Weight  Wheelbase  Length
 0  Acura             MDX    SUV   Asia   ...          23.0  4451.0      106.0   189.0
 1  Acura  RSX Type S 2dr  Sedan   Asia   ...          31.0  2778.0      101.0   172.0
 2  Acura         TSX 4dr  Sedan   Asia   ...          29.0  3230.0      105.0   183.0
 3  Acura          TL 4dr  Sedan   Asia   ...          28.0  3575.0      108.0   186.0
 4  Acura      3.5 RL 4dr  Sedan   Asia   ...          24.0  3880.0      115.0   197.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.00737s, user: 0.005s, sys: 0.001s, mem: 2.02mb

In [4]: tbl.fetch(to=5, sortby=['MSRP'])
Out[4]: 
[Fetch]

 Selected Rows from Table TMPWW1WGLQK
 
       Make             Model   Type Origin   ...   MPG_Highway  Weight  Wheelbase  Length
 0      Kia    Rio 4dr manual  Sedan   Asia   ...          33.0  2403.0       95.0   167.0
 1  Hyundai  Accent 2dr hatch  Sedan   Asia   ...          33.0  2255.0       96.0   167.0
 2   Toyota   Echo 2dr manual  Sedan   Asia   ...          43.0  2035.0       93.0   163.0
 3   Saturn          Ion1 4dr  Sedan    USA   ...          35.0  2692.0      103.0   185.0
 4      Kia      Rio 4dr auto  Sedan   Asia   ...          32.0  2458.0       95.0   167.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.0167s, user: 0.011s, sys: 0.005s, mem: 6.78mb

Of course, it is possible to set the direction of the sorting as well.

In [5]: tbl.fetch(to=5, sortby=[{'name':'MSRP', 'order':'descending'}])
Out[5]: 
[Fetch]

 Selected Rows from Table TMPWW1WGLQK
 
             Make                  Model    Type   ...    Weight Wheelbase  Length
 0        Porsche            911 GT2 2dr  Sports   ...    3131.0      93.0   175.0
 1  Mercedes-Benz              CL600 2dr   Sedan   ...    4473.0     114.0   196.0
 2  Mercedes-Benz  SL600 convertible 2dr  Sports   ...    4429.0     101.0   179.0
 3  Mercedes-Benz           SL55 AMG 2dr  Sports   ...    4235.0     101.0   179.0
 4  Mercedes-Benz              CL500 2dr   Sedan   ...    4085.0     114.0   196.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.017s, user: 0.008s, sys: 0.009s, mem: 6.78mb

If you are using the pandas.DataFrame style API for CASTable, you can also use the sort_values() method on CASTable objects.

In [6]: sorttbl = tbl.sort_values(['MSRP'])

In [7]: sorttbl.head()
Out[7]: 
Selected Rows from Table TMPWW1WGLQK

      Make             Model   Type Origin   ...   MPG_Highway  Weight  Wheelbase  Length
0      Kia    Rio 4dr manual  Sedan   Asia   ...          33.0  2403.0       95.0   167.0
1  Hyundai  Accent 2dr hatch  Sedan   Asia   ...          33.0  2255.0       96.0   167.0
2   Toyota   Echo 2dr manual  Sedan   Asia   ...          43.0  2035.0       93.0   163.0
3   Saturn          Ion1 4dr  Sedan    USA   ...          35.0  2692.0      103.0   185.0
4      Kia      Rio 4dr auto  Sedan   Asia   ...          32.0  2458.0       95.0   167.0

[5 rows x 15 columns]

In [8]: sorttbl = tbl.sort_values(['MSRP'], ascending=False)

In [9]: sorttbl.head()
Out[9]: 
Selected Rows from Table TMPWW1WGLQK

            Make                  Model    Type   ...    Weight Wheelbase  Length
0        Porsche            911 GT2 2dr  Sports   ...    3131.0      93.0   175.0
1  Mercedes-Benz              CL600 2dr   Sedan   ...    4473.0     114.0   196.0
2  Mercedes-Benz  SL600 convertible 2dr  Sports   ...    4429.0     101.0   179.0
3  Mercedes-Benz           SL55 AMG 2dr  Sports   ...    4235.0     101.0   179.0
4  Mercedes-Benz              CL500 2dr   Sedan   ...    4085.0     114.0   196.0

[5 rows x 15 columns]

As previously mentioned, this doesn’t affect anything in the table on the CAS server itself, it merely stores away the sort keys and applies them when data is fetched (either through fetch directly, or any method that calls fetch in the background).