def preprocess_news(news_train):
    # Remove {} and '' from assetCodes column
    news_train['assetCodes'] = news_train['assetCodes'].apply(
        lambda x: x[1:-1].replace("'", ""))
    return news_train

news_data = preprocess_news(news_data)

def unstack_asset_codes(news_train):
    # Build one (news_index, assetCode) pair for every code listed in a row
    codes = []
    indexes = []
    for i, values in news_train['assetCodes'].items():
        explode = values.split(", ")
        codes.extend(explode)
        repeat_index = [int(i)] * len(explode)
        indexes.extend(repeat_index)
    index_df = pd.DataFrame({'news_index': indexes, 'assetCode': codes})
    del codes, indexes
    return index_df

index_df = unstack_asset_codes(news_data)

def merge_news_on_index(news_train, index_df):
    # Expose the row index as a column so it can serve as the merge key
    news_train['news_index'] = news_train.index.copy()
    # Merge news on unstacked assets
    news_unstack = index_df.merge(news_train, how='left', on='news_index')
    news_unstack.drop(['news_index', 'assetCodes'], axis=1, inplace=True)
    return news_unstack

news_data = merge_news_on_index(news_data, index_df)
del index_df

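
To see what the cleaning, unstacking, and merge steps do together, here is a minimal self-contained sketch on a toy frame. The asset codes, the `headline` column, and the names `toy_news` and `unstacked` are invented for illustration; the real `news_data` has many more columns, and this condensed version is not the notebook's own code.

```python
import pandas as pd

# Toy stand-in for news_data; values and the 'headline' column are made up.
toy_news = pd.DataFrame({
    'assetCodes': ["{'AAPL.O', 'AAPL.OQ'}", "{'MSFT.O'}"],
    'headline': ['Apple story', 'Microsoft story'],
})

# Step 1: strip the surrounding braces and the single quotes.
toy_news['assetCodes'] = toy_news['assetCodes'].apply(
    lambda x: x[1:-1].replace("'", ""))

# Step 2: one (news_index, assetCode) pair per code listed in each row.
pairs = [(int(i), code)
         for i, values in toy_news['assetCodes'].items()
         for code in values.split(", ")]
index_df = pd.DataFrame(pairs, columns=['news_index', 'assetCode'])

# Step 3: attach the news columns back onto every pair, then drop the helpers.
toy_news['news_index'] = toy_news.index
unstacked = (index_df
             .merge(toy_news, how='left', on='news_index')
             .drop(['news_index', 'assetCodes'], axis=1))

print(unstacked)
```

Each news row ends up repeated once per asset code it mentions, so a later join against market data can presumably use `assetCode` as a plain key.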
C:\Users\chinn\Anaconda3\lib\site-packages\numpy\lib\nanfunctions.py:1019: RuntimeWarning: Mean of empty slice
return np.nanmean(a, axis, out=out, keepdims=keepdims)
C:\Users\chinn\Anaconda3\lib\site-packages\ipykernel_launcher.py:10: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
# Remove the CWD from sys.path while we load stuff.
C:\Users\chinn\Anaconda3\lib\site-packages\ipykernel_launcher.py:12: FutureWarning: 'argmax' is deprecated, use 'idxmax' instead. The behavior of 'argmax'
will be corrected to return the positional maximum in the future.
Use 'series.values.argmax' to get the position of the maximum now.
if sys.path[0] == '':
C:\Users\chinn\Anaconda3\lib\site-packages\ipykernel_launcher.py:12: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if sys.path[0] == '':
C:\Users\chinn\Anaconda3\lib\site-packages\ipykernel_launcher.py:14: FutureWarning: 'argmax' is deprecated, use 'idxmax' instead. The behavior of 'argmax'
will be corrected to return the positional maximum in the future.
Use 'series.values.argmax' to get the position of the maximum now.
C:\Users\chinn\Anaconda3\lib\site-packages\ipykernel_launcher.py:14: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
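
The pandas warnings above name two patterns: assigning into a slice of a DataFrame, and calling the deprecated `argmax` on a Series. The cell that triggered them is not shown here, so the following is only a sketch of how such warnings are commonly avoided, under the assumption that the flagged lines filter a frame and then assign to it. The frame, its index labels, and the column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame; labels and column names are invented for illustration.
df = pd.DataFrame({'assetCode': ['A', 'A', 'B'],
                   'volume': [10.0, 30.0, 20.0]},
                  index=['x', 'y', 'z'])

# An explicit copy of the filtered rows makes later assignment unambiguous,
# which avoids SettingWithCopyWarning.
subset = df[df['assetCode'] == 'A'].copy()
subset['volume_scaled'] = subset['volume'] / subset['volume'].max()

# To write back into the original frame, select rows and column in one .loc call.
df.loc[df['assetCode'] == 'B', 'volume'] = 0.0

# idxmax returns the index label of the maximum ...
label_of_max = df['volume'].idxmax()
# ... while NumPy's argmax on the underlying array returns its position.
position_of_max = df['volume'].values.argmax()

print(label_of_max, position_of_max)
```

Whether a label or a position is wanted depends on how the result is used afterwards: `.loc` expects labels and `.iloc` expects positions.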
C:\Users\chinn\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py:625: DataConversionWarning: Data with input dtype float32, float64 were all converted to float64 by StandardScaler.
return self.partial_fit(X, y)
C:\Users\chinn\Anaconda3\lib\site-packages\sklearn\base.py:462: DataConversionWarning: Data with input dtype float32, float64 were all converted to float64 by StandardScaler.
return self.fit(X, **fit_params).transform(X)
C:\Users\chinn\Anaconda3\lib\site-packages\ipykernel_launcher.py:4: DataConversionWarning: Data with input dtype float32, float64 were all converted to float64 by StandardScaler.
after removing the cwd from sys.path.