Split Table Data Variables and Apply Functions
This example shows how to split power outage data from a table into groups by region
and cause of the power outages. Then it shows how to apply functions to calculate
statistics for each group and collect the results in a table.
Load Power Outage Data
The sample file, outages.csv, contains data representing electric utility outages in the
United States. The file contains six columns: Region, OutageTime, Loss, Customers,
RestorationTime, and Cause. Read outages.csv into a table.
T = readtable('outages.csv');
Convert Region and Cause to categorical arrays, and OutageTime and
RestorationTime to datetime arrays. Display the first five rows.
T.Region = categorical(T.Region);
T.Cause = categorical(T.Cause);
T.OutageTime = datetime(T.OutageTime);
T.RestorationTime = datetime(T.RestorationTime);
T(1:5,:)
ans=5×6 table
Region OutageTime Loss Customers RestorationTime Cause
_________ ________________ ______ __________ ________________ _______________
SouthWest 2002-02-01 12:18 458.98 1.8202e+06 2002-02-07 16:50 winter storm
SouthEast 2003-01-23 00:49 530.14 2.1204e+05 NaT winter storm
SouthEast 2003-02-07 21:15 289.4 1.4294e+05 2003-02-17 08:14 winter storm
West 2004-04-06 05:44 434.81 3.4037e+05 2004-04-06 06:10 equipment fault
MidWest 2002-03-16 06:18 186.44 2.1275e+05 2002-03-18 23:23 severe storm
Calculate Maximum Power Loss
Determine the greatest power loss due to a power outage in each region. The
findgroups function returns G, a vector of group numbers created from T.Region. The
splitapply function uses G to split T.Loss into five groups, corresponding to the five
regions. splitapply applies the max function to each group and concatenates the
maximum power losses into a vector.
G = findgroups(T.Region);
maxLoss = splitapply(@max,T.Loss,G)
Split Table Data Variables and Apply Functions