What’s New
These are new features and improvements of note in each release.
v0.19.2 (December 24, 2016)
This is a minor bug-fix release in the 0.19.x series and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version.
Highlights include:
- Compatibility with Python 3.6
- Added a Pandas Cheat Sheet (GH13202).
Enhancements
The pd.merge_asof(), added in 0.19.0, gained some improvements:
Performance Improvements
Bug Fixes
- Compat with Python 3.6 for pickling of some offsets (GH14685)
- Compat with Python 3.6 for some indexing exception types (GH14684, GH14689)
- Compat with Python 3.6 for deprecation warnings in the test suite (GH14681)
- Compat with Python 3.6 for Timestamp pickles (GH14689)
- Compat with dateutil==2.6.0; segfault reported in the testing suite (GH14621)
- Allow nanoseconds in Timestamp.replace as a kwarg (GH14621)
- Bug in pd.read_csv in which aliasing was being done for na_values when passed in as a dictionary (GH14203)
- Bug in pd.read_csv in which column indices for a dict-like na_values were not being respected (GH14203)
- Bug in pd.read_csv where reading files fails if the number of headers is equal to the number of lines in the file (GH14515)
- Bug in pd.read_csv for the Python engine in which an unhelpful error message was being raised when multi-char delimiters were not being respected with quotes (GH14582)
- Fix bugs (GH14734, GH13654) in pd.read_sas and pandas.io.sas.sas7bdat.SAS7BDATReader that caused problems when reading a SAS file incrementally.
- Bug in pd.read_csv for the Python engine in which an unhelpful error message was being raised when skipfooter was not being respected by Python’s CSV library (GH13879)
- Bug in .fillna() in which timezone aware datetime64 values were incorrectly rounded (GH14872)
- Bug in .groupby(..., sort=True) of a non-lexsorted MultiIndex when grouping with multiple levels (GH14776)
- Bug in pd.cut with negative values and a single bin (GH14652)
- Bug in pd.to_numeric where a 0 was not unsigned on a downcast='unsigned' argument (GH14401)
- Bug in plotting regular and irregular timeseries using shared axes (sharex=True or ax.twinx()) (GH13341, GH14322).
- Bug in not propagating exceptions in parsing invalid datetimes, noted in Python 3.6 (GH14561)
- Bug in resampling a DatetimeIndex in local TZ, covering a DST change, which would raise AmbiguousTimeError (GH14682)
- Bug in indexing that transformed RecursionError into KeyError or IndexingError (GH14554)
- Bug in HDFStore when writing a MultiIndex when using data_columns=True (GH14435)
- Bug in HDFStore.append() when writing a Series and passing a min_itemsize argument containing a value for the index (GH11412)
- Bug when writing to a HDFStore in table format with a min_itemsize value for the index and without asking to append (GH10381)
- Bug in Series.groupby.nunique() raising an IndexError for an empty Series (GH12553)
- Bug in DataFrame.nlargest and DataFrame.nsmallest when the index had duplicate values (GH13412)
- Bug in clipboard functions on Linux with Python 2 with unicode and separators (GH13747)
- Bug in clipboard functions on Windows 10 and Python 3 (GH14362, GH12807)
- Bug in .to_clipboard() and Excel compat (GH12529)
- Bug in DataFrame.combine_first() for integer columns (GH14687).
- Bug in pd.read_csv() in which the dtype parameter was not being respected for empty data (GH14712)
- Bug in pd.read_csv() in which the nrows parameter was not being respected for large input when using the C engine for parsing (GH7626)
- Bug in pd.merge_asof() where a timezone-aware DatetimeIndex could not be handled when a tolerance was specified (GH14844)
- Explicit check in to_stata and StataWriter for out-of-range values when writing doubles (GH14618)
- Bug in .plot(kind='kde') which did not drop missing values to generate the KDE Plot, instead generating an empty plot. (GH14821)
- Bug in unstack() where, if called with a list of column(s) as an argument, all columns were coerced to object regardless of their dtypes (GH11847)
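Two of the fixes above are easy to check interactively. The following is a minimal sketch, assuming a reasonably recent pandas; the keyword is spelled nanosecond (singular) in current releases, and the variable names are illustrative only.

```python
import pandas as pd

# Timestamp.replace accepts a nanosecond keyword alongside the usual
# datetime fields (spelled `nanosecond` in current pandas; assumed here).
ts = pd.Timestamp("2016-12-24 10:00:00")
ts_ns = ts.replace(nanosecond=5)
print(ts_ns.nanosecond)

# With downcast='unsigned', non-negative integers (zero included) are
# expected to come back with an unsigned integer dtype.
values = pd.Series([0, 1, 2])
downcast = pd.to_numeric(values, downcast="unsigned")
print(downcast.dtype)
```

The exact unsigned width chosen by to_numeric depends on the data, so the sketch only relies on the dtype being unsigned.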
v0.19.1 (November 3, 2016)
This is a minor bug-fix release from 0.19.0 and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version.
Performance Improvements
- Fixed performance regression in factorization of Period data (GH14338)
- Fixed performance regression in Series.asof(where) when where is a scalar (GH14461)
- Improved performance in DataFrame.asof(where) when where is a scalar (GH14461)
- Improved performance in .to_json() when lines=True (GH14408)
- Improved performance in certain types of loc indexing with a MultiIndex (GH14551).
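For readers unfamiliar with the calls named above, here is a small sketch of a scalar asof lookup and line-delimited JSON output. It illustrates the call shapes only, not the performance change, and the sample data is made up for illustration.

```python
import pandas as pd

s = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2016-11-01", "2016-11-02", "2016-11-04"]),
)

# Scalar `where`: returns the last value at or before the given timestamp.
last_known = s.asof(pd.Timestamp("2016-11-03"))
print(last_known)

# lines=True writes one JSON record per line; it requires orient='records'.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
json_lines = df.to_json(orient="records", lines=True)
print(json_lines)
```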
Bug Fixes
- Source installs from PyPI will now again work without cython installed, as in previous versions (GH14204)
- Compat with Cython 0.25 for building (GH14496)
- Fixed regression where user-provided file handles were closed in read_csv (c engine) (GH14418).
- Fixed regression in DataFrame.quantile when missing values were present in some columns (GH14357).
- Fixed regression in Index.difference where the freq of a DatetimeIndex was incorrectly set (GH14323)
- Added back pandas.core.common.array_equivalent with a deprecation warning (GH14555).
- Bug in pd.read_csv for the C engine in which quotation marks were improperly parsed in skipped rows (GH14459)
- Bug in pd.read_csv for Python 2.x in which Unicode quote characters were no longer being respected (GH14477)
- Fixed regression in Index.append when categorical indices were appended (GH14545).
- Fixed regression in pd.DataFrame where constructor fails when given dict with None value (GH14381)
- Fixed regression in DatetimeIndex._maybe_cast_slice_bound when index is empty (GH14354).
- Bug in localizing an ambiguous timezone when a boolean is passed (GH14402)
- Bug in TimedeltaIndex addition with a Datetime-like object where addition overflow in the negative direction was not being caught (GH14068, GH14453)
- Bug in string indexing against data with object Index may raise AttributeError (GH14424)
- Correctly raise ValueError on empty input to pd.eval() and df.query() (GH13139)
- Bug in RangeIndex.intersection when result is an empty set (GH14364).
- Bug in groupby-transform broadcasting that could cause incorrect dtype coercion (GH14457)
- Bug in Series.__setitem__ which allowed mutating read-only arrays (GH14359).
- Bug in DataFrame.insert where multiple calls with duplicate columns can fail (GH14291)
- pd.merge() will raise ValueError with non-boolean parameters in passed boolean type arguments (GH14434)
- Bug in Timestamp where dates very near the minimum (1677-09) could underflow on creation (GH14415)
- Bug in pd.concat where names of the keys were not propagated to the resulting MultiIndex (GH14252)
- Bug in pd.concat where axis cannot take string parameters 'rows' or 'columns' (GH14369)
- Bug in pd.concat with dataframes heterogeneous in length and tuple keys (GH14438)
- Bug in MultiIndex.set_levels where illegal level values were still set after raising an error (GH13754)
- Bug in DataFrame.to_json where lines=True and a value contained a } character (GH14391)
- Bug in df.groupby causing an AttributeError when grouping a single index frame by a column and the index level (GH14327)
- Bug in df.groupby where TypeError raised when pd.Grouper(key=...) is passed in a list (GH14334)
- Bug in pd.pivot_table may raise TypeError or ValueError when index or columns is not scalar and values is not specified (GH14380)
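The two pd.concat fixes concerning string axis names and named keys can be illustrated as follows. This is a sketch under the assumption that the fixed behaviour matches current pandas; the frame contents and the name "source" are hypothetical.

```python
import pandas as pd

s1 = pd.Series([1, 2], name="x")
s2 = pd.Series([3, 4], name="y")

# axis accepts the string aliases 'rows' and 'columns' as well as 0 and 1.
wide = pd.concat([s1, s2], axis="columns")
print(wide)

# When keys is a named Index, its name is expected to become the name of
# the outer level of the resulting MultiIndex.
df = pd.DataFrame({"foo": [1, 2]})
keys = pd.Index(["a", "b"], name="source")
stacked = pd.concat([df, df], keys=keys)
print(stacked.index.names)
```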
v0.19.0 (October 2, 2016)
This is a major release from 0.18.1 and includes a number of API changes, several new features, enhancements, and performance improvements, along with a large number of bug fixes. We recommend that all users upgrade to this version.
Highlights include:
- merge_asof() for asof-style time-series joining, see here
- .rolling() is now time-series aware, see here
- read_csv() now supports parsing Categorical data, see here
- A function union_categoricals() has been added for combining categoricals, see here
- PeriodIndex now has its own period dtype, and changed to be more consistent with other Index classes. See here
- Sparse data structures gained enhanced support of int and bool dtypes, see here
- Comparison operations with Series no longer ignore the index, see here for an overview of the API changes.
- Introduction of a pandas development API for utility functions, see here.
- Deprecation of Panel4D and PanelND. We recommend representing these types of n-dimensional data with the xarray package.
- Removal of the previously deprecated modules pandas.io.data, pandas.io.wb, pandas.tools.rplot.
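As a quick illustration of two of these highlights, the sketch below uses a time-based rolling window and combines two categoricals. It assumes a recent pandas, where union_categoricals is importable from pandas.api.types (the import location in 0.19.0 itself may differ); the data is illustrative.

```python
import pandas as pd
from pandas.api.types import union_categoricals  # import path assumed for recent pandas

# A time-aware rolling window: '2s' means "the last two seconds of data",
# which requires a datetime-like index.
df = pd.DataFrame(
    {"B": [0.0, 1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2013-01-01 09:00:00", periods=5, freq="s"),
)
rolled = df.rolling("2s").sum()
print(rolled)

# Combine two categoricals whose category sets differ.
a = pd.Categorical(["b", "c"])
b = pd.Categorical(["a", "b"])
combined = union_categoricals([a, b])
print(combined)
```

With one observation per second, each 2-second window covers the current and previous rows, so the sums are pairwise.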
Warning
pandas >= 0.19.0 will no longer silence numpy ufunc warnings upon import, see here.
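Code that relied on the old silencing can scope its own suppression with NumPy's errstate context manager. A minimal sketch in plain NumPy, with illustrative values:

```python
import numpy as np

values = np.array([1.0, 0.0])

# Suppress divide-by-zero and invalid-operation warnings only inside this
# block; outside it, NumPy's default warning behaviour applies again.
with np.errstate(divide="ignore", invalid="ignore"):
    quotient = values / 0.0

print(quotient)
```

Keeping the suppression local to the operation that needs it avoids hiding unrelated floating-point warnings elsewhere in a program.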
What’s new in v0.19.0
- New features
  - merge_asof for asof-style time-series joining
  - .rolling() is now time-series aware
  - read_csv has improved support for duplicate column names
  - read_csv supports parsing Categorical directly
  - Categorical Concatenation
  - Semi-Month Offsets
  - New Index methods
  - Google BigQuery Enhancements
  - Fine-grained numpy errstate
  - get_dummies now returns integer dtypes
  - Downcast values to smallest possible dtype in to_numeric
  - pandas development API
  - Other enhancements
- API changes
  - Series.tolist() will now return Python types
  - Series operators for different indexes
  - Series type promotion on assignment
  - .to_datetime() changes
  - Merging changes
  - .describe() changes
  - Period changes
  - Index +/- no longer used for set operations
  - Index.difference and .symmetric_difference changes
  - Index.unique consistently returns Index
  - MultiIndex constructors, groupby and set_index preserve categorical dtypes
  - read_csv will progressively enumerate chunks
  - Sparse Changes
  - Indexer dtype changes
  - Other API Changes
- Deprecations
- Removal of prior version deprecations/changes
- Performance Improvements
- Bug Fixes
New features
merge_asof for asof-style time-series joining
A long-requested feature has been added through the merge_asof() function, to
support asof-style joining of time-series (GH1870, GH13695, GH13709, GH13902). Full documentation is
here.
merge_asof() performs an asof merge, which is similar to a left join
except that it matches on the nearest key rather than on equal keys.
In [1]: left = pd.DataFrame({'a': [1, 5, 10],
   ...:                      'left_val': ['a', 'b', 'c']})
   ...:

In [2]: right = pd.DataFrame({'a': [1, 2, 3, 6, 7],
   ...:                       'right_val': [1, 2, 3, 6, 7]})
   ...:

In [3]: left
Out[3]:
    a left_val
0   1        a
1   5        b
2  10        c

In [4]: right